Journal of Applied Psychology

University of Minnesota
Consulting Editors

George K. Bennett, Psychological Corporation
Harold E. Burtt, Ohio State University
Allen L. Edwards, University of Washington
Clifford E. Jurgensen, Minneapolis Gas Co.
Irving Lorge, T. C. Columbia University
Quinn McNemar, Stanford University
Alexander Mintz, City College of New York
James P. Porter, Danville, Illinois
Harold F. Rothe, Amer. Hosp. Supply Corp.
Julian B. Rotter, Ohio State University
Edward K. Strong, Jr., Stanford University
Donald E. Super, T. C. Columbia University
Morris S. Viteles, University of Pennsylvania
Alfred C. Welch, Knox Reeves, Minneapolis


Volume 36, 1952 


Published Bi-monthly by the American Psychological Association, Inc.
Prince and Lemon Sts., Lancaster, Pa.

Entered as second-class matter, August 19, 1943, at the Post Office at Lancaster, Pa., under the Act of March 3, 1879.
Acceptance for mailing at the special rate of postage provided for in paragraph (d-2), Section 34.40, P. L. & R. of 1948, authorized October 10, 1947.

Copyright, 1952, by The American Psychological Association, Inc.


Contents of Volume 36 


Articles 


of Aptitude Tests for Trainability and for Job Proficiency... 
Browne, C. G. and Neitzel, B. J. Communication, Supervision, and Morale 


Cozan, L. W. Note on Table for Use with Spearman-Brown Formula 
Darley, J. G. Reply to Eells’ Comment on Darley’s “Special Review”
Darley, J. G., Gross, N., and Martin, W. C. Studies of Group Behavior: Factors Associated with the Productivity of Groups
Davis, K. and St. Germain, E. E. An Opinion of a Regional Union Group
Dunnette, M. D. Accuracy of Students’ Reported Honor Point Averages
Drucker, A. J. and Remmers, H. H. A Validation of the SRA Youth Inventory
Edwards, A. L. The Scaling of Stimuli by the Method of Successive Intervals
Eells, K. Comment on Darley’s “Special Review”
Farr, J. N., Jenkins, J. J., Paterson, D. G., and England, G. W. Reply to Klare and Flesch re “Simplification of Flesch Reading Ease Formula”
Flesch, R. Reply to “Simplification of Flesch Reading Ease Formula”


Frederiksen, N. and Schrader, W. B. The ACE Psychological Examination and High School Standing as Predictors of College Success

Fruchter, B. Ability Patterns in Technical Training Criteria 

Fulk, B. E. and Harrell, T. W. Negro-White Army Test Scores and Last School Grade


Green, N. E. Opinions on Communism of Air Force Police Trainees
Guilford, J. S. Temperament Traits of Executives and Supervisors Measured by the Guilford Personality Inventories
Gustad, J. W. Academic Achievement and Strong Occupational Level Scores
Hemphill, J. K. and Sechrest, L. B. A Comparison of Three Criteria of Aircrew Effectiveness in Combat Over Korea
Harrison, R. and Jackson, T. A. Validation of a Clinical Approach ...
Hay, E. N. Some Research Findings with the Wonderlic Personnel Test
Heron, A. A Psychological Study of Occupational Adjustment
... A. H. and Schutz, H. G. A Factor Analysis of a Salary Job Evaluation Plan
... W. L. and Olson, M. W. The Use of Levers in Making ... Scale




Kephart, N. C. and Oliver, J. E. A Punched Card Procedure for Use with the 
Method of Paired Comparisons 


Kirchner, W., Lindbom, T., and Paterson, D. G. Attitudes Toward the Employment of Older People
Klare, G. R. A Note on “Simplification of Flesch Reading Ease Formula”
Kleemeier, R. W. The Relationship between Ortho-Rater Tests of Acuity and Color Vision in a Senescent Group
Krathwohl, W. C. Specificity of Over- and Under-Achievement in College Courses
Kriedt, P. H. Validation of a Correspondence Aptitude Test
Kriedt, P. H., Stone, C. H., and Paterson, D. G. Vocational Interests of Industrial Relations Personnel
Lawshe, C. H. and Cary, W. Verbalization and Learning a Manipulative Task
Lawshe, C. H. and Deutsch, S. The Interests of Industrial Psychology Students
Layton, W. L. Predicting Success of Students in Veterinary Medicine
Lee, M. L. Relationship of Masculinity-Femininity to Tests of Mechanical and ...
Levett, C. M., Jr. Errors of Interpolation in Instrument Reading and Setting
Levine, A. S. and Tupes, E. C. Postwar Research in Pilot Selection and Classification
Levine, J. and Butler, J. Lecture vs. Group Decision in Changing Behavior
Lincoln, R. S. and Smith, K. U. Visual Tracking: II. Effects of Brightness and ...
Littleton, I. T. Prediction in Auto Trade Courses
Long, W. F. A Job Preference Survey for Industrial Applicants
Maloney, P. W. Reading Ease Scores for File’s “How Supervise?”
Manolakes, G. The Effects of Tachistoscopic Training in an Adult Reading Program
McCormick, E. J. and Bachus, J. A. Paired Comparison Ratings: 1. The Effect on Ratings of Reductions in the Number of Pairs
McCormick, E. J. and Niven, J. R. The Effect of Varying Intensities of Illumination Upon Performance on a ...
McCormick, E. J. and Roberts, W. K. Paired Comparison Ratings: 2. The Reliability of Ratings Based on Partial ...
Melville, S. D. and Frederiksen, N. Achievement of Freshman Engineering Students and the Strong Vocational Interest Blank
... Validity of a Check List for ...
Millard, K. A. Is How Supervise? an Intelligence Test?
Mosel, J. N. Prediction of Department Store Sales Performance from Personal Data
Mosel, J. N. and Cozan, L. W. ... Accuracy of Applicat...
Norman, R. D. and Redlo, M. ... Major Groups
Paterson, D. G., Editor ...
... Reading Arabic and Roman Numerals
... Psychological Test Performance of Steel Industry Production Supervisors
Ralph, R. B. and Taylor, C. W. ... Role of Tests in the ...
Rock, M. L. ...

Rubin, G., Von Trebra, P., and Smith, K. U. Dimensional Analysis of Motion: III. Complexity of Movement Pattern
Schreiber, R. J., Smith, R. G., Jr., and Harrell, T. W. A Factor Analysis of Employee Attitudes
Shaw, J., Klausmeier, H. J., Luker, A. H., and Reid, H. T. Changes Occurring in Teacher-Pupil Attitudes During a Two-Weeks Guidance Workshop
Spragg, S. D. S. and Rock, M. L. Dial Reading Performance as a Function of Brightness
Spragg, S. D. S. and Rock, M. L. Dial Reading Performance as a Function of Color of Illumination
Strong, E. K., Jr. Nineteen-Year Followup of Engineer Interests
Swordes, A. Effect of Changing the Number of Item Responses from Five to Four ...
Thomas, L. L. A Cluster Analysis of Office Operations
Tobolski, F. P. and Kerr, W. A. Predictive Value of the Empathy Test in Automobile Salesmanship
Traphagen, A. L. Interest Patterns and Retention and Rejection of Vocational ...
Trenchard, K. I. and Crissy, W. J. E. Readability of Advertising and Editorial Copy in Time and Newsweek
Tuckman, J. and Lorge, I. Attitudes Toward Older Workers
Twedt, D. W. A Multiple Factor Analysis of Advertising Readership
Valenti, J. J. Measuring Educational Leadership Attitudes
Van Zelst, R. H. Empathy Test Scores of Union Leaders
Verberg, W. A. Vocational Interests of Retired YMCA Secretaries
Von Trebra, P. and Smith, K. U. The Dimensional Analysis of Motion: IV. Transfer Effects and Direction of Movement
Wehrkamp, R. and Smith, K. U. Dimensional Analysis of Motion: II. Travel-...
Weiss, I. Prediction of Academic Success in Dental School
... L. V. Employment Prognosis of the Post-Poliomyelitic

chen O a anlata Eemployment 
Wickert, F. R. How Supervise? Scores Before and After Courses in Psychology
Wickert, F. R. Relation between How Supervise?, Intelligence, ... a Group of Supervisory Candidates in Industry
Bi tern CE emt atin tores Yin 
Zachert, V. and Levine, A. S. Education and Prediction of Military School Success
Zuckerman, J. V. Interest Item Response Arrangement as it Affects Discrimination between Professional Groups

Book Reviews 
Berdie’s Concepts and Programs of Counseling: John W. Gustad
Berrien’s Comments and Cases on Human Relations: Howard P. Mold
Boder’s I Did Not Interview the Dead: W. Grant Dahlstrom
Bryan’s The Public Librarian (With a Section on the Education of Librarians ...): Errett W. McDiarmid
Cantor’s Learning Through Discussion: Howard P. Mold
Cattell’s Personality: A Systematic Theoretical and Factual Study: Abraham S. ...




Committee on Undersea Warfare of the National Research Council’s A Survey Report on Human Factors in Undersea Warfare: W. A. ...
Dale’s Planning and Developing the Company Organization Structure: C. G. Browne
DiMichael’s Vocational Rehabilitation of the Mentally Retarded: Vivian H. Hewer
Dooher and Marquis’ Rating Employee and Supervisory Performance: C. E. Jurgensen
Eells, Davis, Havighurst, Herrick, and Tyler’s Intelligence and Cultural Differences: ...
Feingold’s Scholarships, Fellowships, and Loans: Solomon ...
Flesch’s How to Test Readability: Robert L. ...
Freeman and Taylor’s How to Pick Leaders: Theodore R. Lindbom
Fuess’s The College Board: Its First Fifty Years: Arthur E. Traxler
Gouldner’s Studies in Leadership. Leadership and Democratic Action: C. G. Browne
Gulliksen’s Theory of Mental Tests: Allen L. Edwards
Hathaway and Meehl’s An Atlas for the Clinical Use of the MMPI: George S. Welsh
Hersey’s Better Foremanship: Key to Profitable Management: Theodore R. Lindbom
Institute for Human ... Psychological Counselors: Harold B. ...
Jalota’s Scientific Personnel Selection Procedure: A Study: Edwin E. Ghiselli
Keys, Brozek, Henschel, ...
...’s The Way to Security: A. T. Poffenberger
Read’s Education Through Art: Dale B. Harris
Riley, Ryan, and Lifshitz’s The Student Looks at his Teacher: E. ...
Rogers’ Client-Centered Therapy: Harold B. ...
Rose’s Union ...
Spearman and Jones’ Human Ability: Douglas ...
...’s Psychosomatics and Suggestion Therapy in Dentistry: William T. Heron
...’s Principles of Personality Counseling: Edward S. ...
...’s How to Make Achievement Tests: Harold ...
...’s ... Structure of Human Abilities: Douglas ...
...’s ... Circular Causal and Feedback Mechanisms in Biological and Social Systems: ...
Welford’s Skill and Age: an Experimental Approach: Theodore R. Lindbom


Miscellaneous 


Journal of Applied Psychology 


Vol. 36, No. 1


FEBRUARY, 1952 


Reading Abilities of Business Executives 


Carol S. Bellows and Carl H. Rush, Jr. 


Personnel Research Center, Wayne University 


Business executives are exposed to large 
quantities of reading materials every year. 
In 1944 there were 577 commercial and financial
digests, news letters, and information
services. These have probably increased in
the past 6 years. Books published on business
and economic subjects number about 500
annually. There is hardly need to mention
the thousands of press releases, letters, and 
memoranda that find their way to the execu- 
tive’s desk. 

What do business men read? How well do 
they read? After a brief comment in answer 
to the first question, this paper will be con- 
cerned with the second one. Data presented 
are based on the results of courses in reading 
efficiency participated in by 150 executives 
between 1947 and 1950.


Reading Interests of Executives 


Responses to a questionnaire circulated in 
1949 to more than 20,000 subscribers to the 
Harvard Business Review throw some light on 
what executives ordinarily read (2). Two
newspapers were read by virtually all of the
executives responding. Almost half of them
subscribed to the Wall Street Journal. The
New York Times was reported read by 35 per
cent of the respondents; 75 per cent read one or
more of the trade publications in their own
field such as Iron Age or Women’s Wear Daily.
In addition they were likely to read one or
more magazines of general business interest
such as Business Week, Fortune, or Time.
Frequently they resorted to digests and reports
of longer books and articles. The subjects
most widely pursued in books included: personnel
psychology, business management,
economics, marketing, and accounting.


The attitude of many was summarized in 
the comment of one of the executives: “There 
just isn’t time to read one-tenth of all I would 
like to read.” One of the possible approaches 
to this problem is the training of executives in 
effective reading skills. With proper training, 
reading speed and comprehension can be increased,
resulting in a saving of time or a greater
coverage of reading materials in the same
amount of time.


The Study 


Participants in the Study. Training in silent
reading skills was given to over 150 executives
in two industrial plants, two banks, a large
department store, and a women’s specialty
store. Less than 5 per cent of the participants
were women. Data presented here are based
on the cases on whom data were sufficiently
complete to warrant statistical treatment.

The group ranged in age from 22 to 65 years.
None could be classified as seriously deficient
in any basic reading skill. Illustrative positions
held by individuals in the group were:
vice presidents of banks, merchandising managers,
engineering supervisors, a personnel
manager, comptroller, and industrial relations
manager. Election of the course and attendance
were voluntary.
Design of the Training Course. Meetings ... “reading for ...” ...
skimming,” ... different purposes ... and (3) a pacing
exercise, i.e., one or two of the Harvard Read-




Table 1

Comparison of Measured Reading Performances of 1st with 10th Class Meeting

                                        Before Training   After Training    Correlation,
                                        (1st Meeting)     (10th Meeting)    1st with 10th   Critical
                                   N    Mean     S.D.     Mean     S.D.          r          Ratio
Reading Rate (Speed Check:
  Words per minute)                71   276.6    70.5     439.9    85.2         .34         18.05**
Michigan Speed of Reading Test
  (Forms 1 and 2)                  62    51.9     9.7      56.2     8.2         ...          4.14**
Nelson-Denny Vocabulary Test
  (Forms A and B)                  58    61.3    13.9      64.7    13.9         .70          2.38*
Nelson-Denny Paragraph Reading Test
  (Forms A and B)                  58    47.1    11.7      52.4    10.6         .75          5.07**
Nelson-Denny Total
  (Forms A and B)                  58   108.5    22.5     117.6    20           .75          4.32**

 * Significant at the 5% level.
** Significant at the 1% level.


ing Films (4) followed by a comprehension 
check. 

Objective tests used (alternate forms for 
initial and final testing) included the Nelson- 
Denny Reading Test and the Michigan Speed 
of Reading Test. The Michigan Vocabulary 
Profile Test was also given so that the parti- 
cipants might gain some insight into their 
particular strengths and weaknesses in voca- 
bulary. Individual test results were made 
available to the participants at both the be- 
ginning and end of the course. 


Results 


Statistical analyses were conducted to evalu- 
ate the course in terms of reading gains as 
measured by standard reading tests. Table 1 
shows a comparison between “before” and 
“after” results on these tests. 

It will be noted that all critical ratios¹ were
significant at the 1% level of confidence with
the exception of the Nelson-Denny Vocabulary
Test. These results suggest that statistically
significant improvements in some reading
skills took place as a result of the reading
course. Large gains in reading rate are shown.
Follow-up studies would be of value as a check
on the permanency of gains.

¹ The formula used to test the significance of difference
between before and after means was:

$$\text{Critical Ratio} = \frac{D_M}{\sigma_{D_M}} = \frac{M_1 - M_2}{\sqrt{\sigma_{M_1}^2 + \sigma_{M_2}^2 - 2\,r\,\sigma_{M_1}\sigma_{M_2}}}$$
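
The critical ratio for the difference between two correlated means, as given in the footnote, can be checked against Table 1 with a short calculation. This is a minimal sketch, not the authors' own computation: the function name is hypothetical, and it assumes that the standard error of each mean is S.D. divided by the square root of N, which the article does not state.

```python
import math

def critical_ratio(mean_before, sd_before, mean_after, sd_after, r, n):
    """Critical ratio for the difference between two correlated means."""
    # Standard error of each mean (assumption: SD / sqrt(N))
    se_before = sd_before / math.sqrt(n)
    se_after = sd_after / math.sqrt(n)
    # Standard error of the difference, allowing for the before/after correlation r
    se_diff = math.sqrt(
        se_before ** 2 + se_after ** 2 - 2 * r * se_before * se_after
    )
    # Reported as a positive value: after-training mean minus before-training mean
    return (mean_after - mean_before) / se_diff

# Nelson-Denny rows as printed in Table 1 (N = 58)
cr_vocabulary = critical_ratio(61.3, 13.9, 64.7, 13.9, r=0.70, n=58)
cr_paragraph = critical_ratio(47.1, 11.7, 52.4, 10.6, r=0.75, n=58)
print(round(cr_vocabulary, 2), round(cr_paragraph, 2))
```

Under these assumptions both values come out close to the tabled critical ratios of 2.38 and 5.07; small discrepancies are to be expected because the tabled means and standard deviations are rounded.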

Inspection of the dispersion of before and
after measures suggested further analysis of
gains. The reader will note that the before
and after dispersions on the standardized tests
show only slight differences. On the Reading
Rate variable, however, the standard deviation
increased considerably at the end of training.
This increase in dispersion may be attributed
to differential effects of the training on various
individuals. It was found that increase in
reading speed (words per minute) correlated
−.32 with initial reading rate. In other words,
there was a slight tendency for the slower
readers to gain more in terms of increased
speed than those who began the course at a
faster rate. It was also found that increase
in reading speed correlated −.41 with age.
This suggests a reasonable interpretation that
the younger trainees tend to gain more from
the course.
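
A correlation between gain and initial rate of the kind reported here is an ordinary Pearson product-moment coefficient computed on gain scores. The sketch below shows the calculation only; the six pairs of scores are hypothetical values invented for illustration and are not data from the study, and the function name is an assumption.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Hypothetical words-per-minute scores for six trainees (not data from the study)
initial_rate = [210, 250, 280, 300, 340, 380]
final_rate = [420, 430, 450, 440, 470, 500]

# Gain is the final rate minus the initial rate for each trainee
gain = [final - initial for initial, final in zip(initial_rate, final_rate)]

r_gain_initial = pearson_r(initial_rate, gain)
print(round(r_gain_initial, 2))
```

A negative coefficient, as in this made-up example, means that trainees who started at lower rates tended to show larger gains.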




Table 2

Percentile Equivalents for Mean Scores on Reading
Tests Before and After Training *

                                     Percentile   Percentile
Test                                   Before       After
Michigan Speed of Reading                26           45
Nelson-Denny Vocabulary                  75           82
Nelson-Denny Paragraph Reading           57           70
Nelson-Denny Total                       70           82

* Compared with college seniors.


In order to make the raw score results more
meaningful, “before” and “after” means were
converted to percentile equivalents in Table 2.
In the absence of norms on business populations,
college senior norms were used.

During each meeting of the course the participants
were given a timed speed check on one
of a series of standard 1,000 word reading
exercises. Each person kept an individual
graph of the words read per minute in these
speed checks. Results on these reading exercises
are shown in Table 3 for the 71 participants
present at all meetings.


Table 3

Average Weekly Speed of Reading for 71 Business
Executives during a Ten-Week
Reading Course

                      Standard
Meeting     Mean      Deviation
   1        276.6       70.5
   2        290.0       69.9
   3        338.6       57.9
   4        369.9       67.9
   5        387.5       81.0
   6        396.7       75.4
   7        388.6       87.5
   8        415.7       81.2
   9        425.5       90.2
  10        440.0       85.2


Table 4 shows the intercorrelations (Pearson
Product-Moment) between the various measures
of reading skill. Two separate matrices
are given to indicate the intercorrelations
between the tests before training and after
training.


Table 4

Intercorrelation Matrix of Reading Tests Administered
to 61 Business Executives *

                                       1      2      3      4      5
1 Reading Rate (Speed Check)          --     .89    .21    .85    .20
2 Michigan Speed of Reading          .19     --     ...    .35    .28
3 Nelson-Denny Vocabulary            .09    .56     --     .61    .90
4 Nelson-Denny Paragraph Reading     .13    .50    .53     --     ...
5 Nelson-Denny Total Score           .14    .60    .85    ...     --

* Coefficients on the left of the diagonal are the intercorrelations
between tests administered during the first
meeting of the course. Coefficients on the right of the
diagonal are the intercorrelations for alternate forms of
the same tests administered during the tenth meeting.


In general, the intercorrelations tend to increase
in the final testing. The low correlations
between Reading Rate and the Michigan
Speed of Reading Test would suggest that
different aspects of reading speed are being
measured by the two tests.


Discussion 


At the beginning of training the executive 
group averaged approximately 275 words per 
minute. They scored on the Michigan Speed 
of Reading Test at the 26th percentile for 
college seniors. This suggests that the group 
was initially rather poor in speed of reading. 
Vocabulary, on the other hand, as measured 
by the Nelson-Denny Test was at the 75th 
percentile for college seniors at the beginning 
of the course. Gains in rate were greater and
more significant than vocabulary gains as ...
It might be noted that at the beginning of training
51, or 71.8 per cent of the group read below 300
words per minute. At the end of training only 1,
or 1.4 per cent of the group read below 300 words
per minute.

Vocabulary training probably requires a
great deal more time and effort than was
possible in this course. The gains in vocabulary
reported in Tables 1 and 2 may only
reflect the participants’ increased speed in
reading the test items. A control group might
have shown as much gain.


The causes of slow reading among these competent
adults are probably complex, but some
of them can be suggested:


1. Carry-over from early childhood of oral 
reading habits, word-for-word rather than 
phrase reading: “hearing” each word when
reading silently rather than apprehending con- 
nected phrases directly. 

2. Over-cautious approach to printed matter 
because of fear of losing “something important”: 
there is a tendency to note carefully every word 
even when the material does not warrant such 
close scrutiny. Many executives apparently 
had never developed the useful arts of selective 
reading or skimming. The use of key words 
in extracting the most meaning with the least 
effort was apparently not a part of the basic 
reading equipment of most of the executive 
group. 

3. Difficulty in concentration and remembering:
the length of time required to put words
together into meaningful ideas is perhaps often
so arduous and time-consuming that material
presented in the first part of a paragraph is
“forgotten” before the end of the paragraph
is reached. Under these circumstances it is
hard for an individual to maintain interest,
to concentrate.

4. Persistence of reading patterns related to
particular job duties: engineers, accountants
and similarly trained executives are likely to
persist in the habit of giving close attention to
each printed symbol. This ... the individual
symbol ... is carried over to more
general reading in books and newspapers.




Findings of other workers in the area of
adult reading are similar to those reported here.
Broxson (1) and Buswell (3) have found that
the average adult tends to read more slowly
than is necessary and also tends to read all
types of material at about the same rate without
regard for level of difficulty. Most investigators
have found their samples of adults
averaging about 300 words per minute.


Summary 


A course in reading efficiency was conducted
in the Detroit area for several groups of business
and industrial executives. Results of
standard reading tests administered at the
beginning and end of the course showed slight
but statistically significant gains in rate of
reading and accuracy of comprehension. The
before and after gains in reading rate on non-test
materials, without checks on comprehension,
were quite large. The findings suggest
that business executives can significantly
increase their reading rate on practice exercises.

Received March 2, 1951.


References

1. Broxson, J. A. Teaching adults to read. Peabody
   J. Educ., 1943, 20, 166-172.
2. Bursk, E. C., and Clark, D. T. Reading habits of
   business executives. Harvard Bus. Rev., 1949,
   27, 330-340.
3. Buswell, G. T. How adults read. Chicago: Department
   of Education, University of Chicago,
   1937.
4. Harvard reading films. Cambridge: Harvard University
   Press, 1948.


Validation of a Correspondence Aptitude Test 


Philip H. Kriedt 


Prudential Insurance Company, Newark, New Jersey 


In the validation of industrial tests, it is fre- 
quently difficult to find a single criterion 
which is completely satisfactory. Sometimes 
it seems advisable to use a number of criteria 
in evaluating a test’s effectiveness. This is
especially true if the validation study involves 
a fairly small sample of individuals and if the 
key for scoring the test is being developed 
empirically using this sample. The following 
report illustrates a validation study of this 
nature in which a variety of criteria were used. 

In order to develop an aptitude test battery 
to predict success in the job of correspondence 
clerk, a study was made of 200 such clerks in 
the Home Office of The Prudential Insurance 
Company. These clerks conduct correspon- 
dence with policyholders, district managers, 
and others. Although their duties are roughly 
similar, their assignments vary considerably 
with respect to job levels as assigned by Job 
Evaluation. Correspondence clerks with high 
job levels conduct correspondence regarding 
complex problems while clerks on lower levels 
handle more routine correspondence. 

In developing a selection battery for this 
job, it seemed desirable to attempt the con- 
struction of a test which would measure in a 
fairly direct manner aptitude for writing clear 
and tactful business letters. A preliminary 
form of such a test was made by describing 
48 business situations requiring a letter to be 
written. For each situation three short para- 
graphs differing in clarity, brevity, and tone 
were prepared that might have been included 
in such a letter. Individuals taking the test 
were asked to select the paragraph that they 
thought most people would prefer to receive 
and also the paragraph they thought most 
people would least like to receive. 


Selection of Criteria 
A combination of the following three differ- 


ent criteria was used in keying this test: (1) 
Ratings of clerks by supervisors; (2) Job 


level as an indirect measure of ability of clerks; 
and (3) Preferences of a group of employees 
selected to be similar in their distribution of 
age, sex, and education to policyholders and 
others who commonly receive Company corre- 
spondence. These employees were asked to 
indicate the paragraphs they themselves would 
prefer to receive and least like to receive. It 
was not practicable to sample the preferences 
of persons who actually correspond with the 
Company. 

None of these criteria appeared to be ade- 
quate as a single criterion. Supervisory 
ratings probably reflect quite accurately the 
ability of an individual to give correct informa- 
tion in answer to questions and the ability to 
conduct a desirable amount of correspondence, 
but it is not at all certain that supervisors 
themselves know what kinds of letters are 
considered clear and tactful by those who 
receive them; consequently, supervisors may 
not have evaluated this ability properly in 
making their ratings. Job level is greatly
affected by sex (most women being in lower
level jobs) and length of experience as well as
by ability. The employees selected to be representative
of policyholders and others are
probably not a completely adequate sample
to represent this group.


Development of the Key 


The 200 correspondence clerks were divided
into two groups of 100 each, matched for
supervisory rating, sex, and job level. One
group was used for developing a key and the
other group for cross-validation purposes.
The group on which the key was developed was
divided into the high third and low third based
on an arbitrary combination of supervisory
rating and job level.

A key was developed by two successive
response analyses. First, responses were selected
which showed at least a 12 per cent
difference in the frequency with which they
were chosen by the high and low criterion
groups. The 12 per cent figure was an arbitrary
one suggested by several empirical studies
on the development of interest questionnaire
keys. The writer knows of no practicable
non-arbitrary method of selecting responses
for such a key.

Next, each response which met this first 
requirement had to meet a second requirement. 
The responses of the high ability group had to 
be more like the responses of the “policy- 
holder” group than were the responses of the 
low ability clerks. For instance, if Paragraph
A were ranked third by at least 12 per cent
more high ability clerks than low ability clerks,
then the rank of 3 had to be the most frequent
response for the “policyholder” group in order
for this response to be keyed.
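
The two successive response analyses can be sketched as a small function applied to one response at a time. This is an illustrative sketch only: the function and variable names are hypothetical, the group data are invented, and the treatment of responses favored by the low group (keyed −1 only when they are not the modal "policyholder" response) is an assumption, since the text gives an example only for responses favored by the high group.

```python
from collections import Counter

MIN_DIFF = 12.0  # percentage points; the arbitrary threshold given in the text

def percent_choosing(responses, response):
    """Percentage of a group giving a particular response to one item."""
    return 100.0 * sum(1 for r in responses if r == response) / len(responses)

def keyed_weight(high, low, policyholder, response):
    """Return +1, -1, or None (not keyed) for one response to one item.

    high, low, policyholder: lists of the responses each group gave to the item.
    """
    diff = percent_choosing(high, response) - percent_choosing(low, response)
    if abs(diff) < MIN_DIFF:
        return None  # fails requirement 1
    modal = Counter(policyholder).most_common(1)[0][0]
    if diff > 0:
        # Requirement 2 as described: must be the modal "policyholder" response
        return 1 if response == modal else None
    # Assumption: low-group responses are keyed -1 only if NOT the modal response
    return -1 if response != modal else None

# Hypothetical ranks (1-3) assigned to Paragraph A on one item
high_group = [3] * 20 + [2] * 10 + [1] * 3
low_group = [3] * 10 + [2] * 13 + [1] * 10
policyholder_group = [3] * 30 + [2] * 15 + [1] * 5

weights = {rank: keyed_weight(high_group, low_group, policyholder_group, rank)
           for rank in (1, 2, 3)}
print(weights)
```

In this made-up item, rank 3 would be keyed +1, rank 1 would be keyed −1, and rank 2 would be dropped because the two clerk groups differ by less than 12 percentage points.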


Eighty-eight of the 172 responses which
met the first requirement also met the second
requirement, and these 88 responses, scored
either +1 or −1, make up the key for the test.
Since only half the responses which passed
requirement 1 also passed requirement 2, it
would seem that the key was developed on
two quite unrelated criteria: a combined measure
of supervisory rating and job level, and
the preferences of a group representative of
policyholders.
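
Scoring with a key of +1 and −1 weights amounts to summing the weights of the keyed responses a person gives. The sketch below assumes, hypothetically, that each response is recorded as an (item, paragraph, rank) tuple and that unkeyed responses contribute nothing; the three-entry key and the answers are invented for illustration and are not taken from the actual 88-response key.

```python
def score_test(key, answers):
    """Sum the key weights for every keyed response the person gave."""
    # Responses that are not in the key contribute 0 to the score
    return sum(key.get(answer, 0) for answer in answers)

# Hypothetical fragment of a key: (item, paragraph, rank) -> weight
example_key = {
    (1, "A", 3): 1,
    (1, "B", 1): -1,
    (2, "C", 1): 1,
}

# Hypothetical answers from one person on items 1 and 2
example_answers = [
    (1, "A", 3), (1, "B", 2), (1, "C", 1),
    (2, "C", 1), (2, "A", 3), (2, "B", 2),
]

total_score = score_test(example_key, example_answers)
print(total_score)

# Share of first-stage responses that survived the second requirement
share_kept = 88 / 172
```

The last line simply restates the figures in the text: 88 of the 172 responses, or about 51 per cent, passed both requirements.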


Estimates of Validity

It seemed desirable to estimate the validity of
this key in a number of different ways. The following
... correlations ... Product ... ratings, 38 ...
correspondence ... the cross-validation ...
is significant at the .05 level.




3. Eight research technicians in the Personnel
Research Division independently made
an analysis of the scoring key to discover what
kinds of paragraphs seem to be favored or
penalized by the key. This analysis indicated
that the key, although it appears to be a fairly
subtle one, strongly favors the following three
types of replies:


a. Cordial and friendly paragraphs are pre- 
ye a Nar: iri- 
ferable to cool and disinterested ones. For in 
stance: * 


“Thank you for the interest you have es 
pressed in our recent ‘campaign eon 
savings accounts. Unfortunately, howev y 
our small staff is just plain ‘too busy’ nt per 
together the materials you ask for. Per naps 
we will be able to help you later on. 


is better than 


sats like 
“T hope we can take care of requests Na 
yours at some future time. At presel 
however, we are much too busy to do 50. 


. epforable 
b. Customer-centered replies are preferé 
lo Company-centered ones. For inslance: 
cing for 
“I am sure you are constantly looking 
ways to improve the product you sell. 


is better than 


, appoint- 
“I would very much like to make an apr an- 
ment to talk to you about the many ad 
tages of our flour.” 


Bg ] 5 „plies 
c. Positive replies are preferable to repl s 
which 


: 4 zor instant” 
have a negative emphasis. For insta 


“We have been so rushed in recent weeks
that our service to customers has necessarily
suffered. You may be sure that we will
soon get back to our old schedule and that
you will be able to count on prompt service
once again.”


is better than

“I am very sorry that you have found it
necessary to make a complaint about our
service. However, I am sure your complaint
is justified as we have not been able to maintain
our usual standards of service lately.”


merely 
than pl on the customer, and are 
awkward and stereotyped, 




This content analysis of the key indicated 
that the test is in general agreement with 
frequently accepted principles of good letter 
writing. 

Summary 


In general, it seems that a variety of in- 
dependent methods of evaluation all indicate 
that this test probably has a moderate rela- 
tionship with the ability to write clear and 
tactful correspondence and also with success 
as a correspondence clerk. The test is related 
to supervisory ratings of correspondence ability 


and to job level for a group of clerks not used 
in developing the key. The key, by virtue of 
the manner in which it was constructed, is 
related to the personal preferences of a group 
of employees selected so as to be representative 
of policyholders and others who receive Com- 
pany correspondence. A small group of corres- 
pondence experts score significantly higher 
than correspondence clerks on the test. 
Finally, a content analysis of the key indicates 
that it is in agreement with common principles 
of good letter writing. 


Received April 23, 1951.


* 
Prediction of Department Store Sales Performance from Personal Data 


James N. Mosel 
The George Washington University 


While extensively used in other fields of 
selling, the personal data blank has received 
little attention in the selection of department 


store sales personnel. Early studies of a few 
characteristics were m 


exploratory. More recently, Stead (3) found 
that of 10 different 


ith a composite 
Similarly, Stead, 
a personal data 


satisfactorily 
terion of sales 


d. A system of di 
inst several criteri 
sis for the apprai: 
eatly increase th 


* The writer wishes to express hi i 
age XP IS gratitude to Mr. 
ae nee for Collection of the data and his part 


Procedures 


Criterion. As a criterion of sales performance,
“selling cost per cent” was used.
This index is computed for each employee by
dividing total selling cost (salary and commissions)
by total net sales (dollar value of actual
sales).¹ These ratings are essentially a measure
of the dollar worth of an employee to the
company, and are computed
by departments. The index has been worked
out by the store in conjunction with the National
Retail Dry Goods Association, which
furnishes member stores with typical selling
cost per cents for similar stores and goals for
each department.
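The index just described can be illustrated with a short computation. The following is a minimal sketch only: the function name and the dollar figures are hypothetical, and the multiplication by 100 is an assumption made so that the ratio reads as a per cent.

```python
def selling_cost_per_cent(salary, commissions, net_sales):
    """Total selling cost (salary plus commissions) as a per cent of net sales.

    Hypothetical helper; the scaling by 100 is an assumption based on the
    name of the index, not a formula quoted from the store's records.
    """
    if net_sales <= 0:
        raise ValueError("net sales must be positive")
    return 100.0 * (salary + commissions) / net_sales


# Hypothetical clerk: $1,800 salary, $200 commissions, $25,000 net sales.
example = selling_cost_per_cent(1800.0, 200.0, 25000.0)
print(f"selling cost: {example:.1f} per cent")
```

A lower value indicates that less was paid out per dollar of sales, which is why low selling cost corresponds to better performance.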

As a measure of job performance, selling
cost per cent contains an obvious source
of contamination. Interdepartmental differences
in the price of goods and consumer demand
create inequalities in the opportunity
to earn net sales. Thus an employee’s selling
cost would to some extent be a function of the
department to which she was assigned. A
further source of contamination lies in the
fact that there was a relationship between
time on the job and selling cost. Previous
analysis had shown that during the first
months on the job there is a steady average
decrease in selling cost. While the company
adjusts for these factors “clinically” in evaluating
employees, it was necessary in the present study
to partial out these biases before selling cost
per cent could be used for criterion purposes.
This was achieved operationally by the method
of selecting the criterion groups as described
below.


2 e- 
additional controls ž 


¹ For more detailed explanation of these measures,
see 4, pp. 80-81.

² When used by the company to make personnel
decisions, these inequalities are adjusted for
subjectively. Management representatives equate




It seemed desirable that predictions be made 
in terms of it. 

In this connection it is interesting to note 
that Stead, Shartle et al. (4) report a reli- 
ability of .88 for the selling cost criterion, 
based on the correlation between the first and
second half of a year’s sales data. When
corrected by the Spearman-Brown formula 
this yielded an estimated reliability for one 
year’s data of .94. They also found selling 
cost to correlate —.15 with supervisors’ rat- 
ings, indicating clearly that the two measures 
are reflecting different aspects of department 
store selling. The negative sign in this coeffi- 
cient does not mean that sales performance 
and ratings are negatively related, but simply 
reflects the nature of the selling cost per cent 
ratio: the higher the selling cost, the poorer 
the performance. 
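The step from the half-year figure to the full-year estimate can be checked directly. The sketch below applies the general Spearman-Brown formula; the function name is illustrative, and only the value .88 is taken from the text.

```python
def spearman_brown(r_part, lengthening_factor=2.0):
    """Estimated reliability when a measure is lengthened by the given factor."""
    k = lengthening_factor
    return k * r_part / (1.0 + (k - 1.0) * r_part)


# Half-year reliability reported by Stead, Shartle et al. (4).
full_year = spearman_brown(0.88)
print(f"estimated reliability for one year's data: {full_year:.2f}")
```

With a half-year correlation of .88 the formula gives 1.76 / 1.88, which rounds to the .94 quoted above.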

Criterion Groups. For purposes of item 
analysis it was desired to obtain an upper and 
lower criterion group representing the extremes 
in sales performance. These contrasting 
groups were secured by consulting the com- 
pany’s application files for the 1948 fiscal year 
and selecting from each of the 85 departments 
the sales clerk who had subsequently achieved 
the highest and the sales clerk who had 
achieved the lowest selling cost per cent. In 
this way it was possible to obtain an upper 
and lower criterion group, each containing
85 sales women, and roughly equated on inter- 
departmental differences in selling opportunity. 
This procedure had a further advantage in that 
it insured representation of all departments, 
a feature which was highly desirable since it 
was important to produce an instrument 
having applicability to all departments. 
Furthermore, only those employees who had 
been on the job for at least six months were 
chosen for study, thus reducing the effect of 
experience. 
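The selection rule described in this paragraph can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually run on the company files: the record layout, the function name, and the sample data are hypothetical.

```python
def select_criterion_groups(records, min_months=6):
    """Pick, within each department, the clerk with the lowest and the clerk
    with the highest selling cost per cent.

    records: iterable of (clerk_id, department, selling_cost_pct, months_on_job).
    Clerks with fewer than min_months on the job are dropped first.
    Returns (low_cost_group, high_cost_group) as lists of clerk ids.
    """
    by_department = {}
    for clerk_id, department, cost, months in records:
        if months >= min_months:
            by_department.setdefault(department, []).append((cost, clerk_id))

    low_cost_group, high_cost_group = [], []
    for clerks in by_department.values():
        if len(clerks) < 2:
            continue  # a department needs two eligible clerks to supply both extremes
        low_cost_group.append(min(clerks)[1])
        high_cost_group.append(max(clerks)[1])
    return low_cost_group, high_cost_group


# Hypothetical records for two departments.
sample = [
    ("a", "shoes", 6.0, 12), ("b", "shoes", 9.5, 8), ("c", "shoes", 4.0, 3),
    ("d", "toys", 7.0, 24), ("e", "toys", 5.5, 10), ("f", "toys", 8.0, 7),
]
low_group, high_group = select_criterion_groups(sample)
print(low_group, high_group)
```

Because one clerk is drawn for each group from every department, the two groups are matched on department, which is the sense in which they are "roughly equated" on selling opportunity.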

But despite these controls, the criterion
groups were probably not so contrasting as
they might appear. There was some evidence
of a relationship between department and
caliber of personnel. Through transfers and
placements the better sales persons may have
tended to gravitate toward the more critical


sales volume and make allowances for seasonal fluctuations
and department volume. The reliability and
validity of such adjustments are not known.


departments, leaving the less capable to ac- 
cumulate in others. Consequently, the best 
employee in an inferior department might 
actually be mediocre on an absolute basis, 
while the low employee in a superior depart- 
ment might actually be quite good. If this 
effect was in force, and management con- 
tended that it was, there would be a displace- 
ment of both groups toward the middle and a 
consequent reduction in their discriminability. 
This would impose a restriction upon the discrimination
power of the personal data when
subjected to item analysis. Results, however,
would be strengthened by this handicap, since 
any differentiations obtained would be under- 
estimates. 

Item Analysis. The Chi-square test was 
applied to the category frequencies of each 
item to determine whether the high and low 
selling cost responses could represent a homo- 
geneous population. Of the 42 items of in- 
formation submitted by the applicant at the 
time of employment, 12 proved to distinguish 
between the two groups at the .05 confidence 
level. These were: age, years of formal education,
years of previous selling experience,
weight, height, time on last job, time on next 
to last job, domicile, type of principal experi- 
ence, number of dependents, marital status, 
and time lost on job in last two years. 
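The item analysis can be illustrated for a single item. The sketch below computes the chi-square statistic for a two-group by k-category frequency table and compares it with the standard .05 critical value; the category counts are hypothetical, since the original frequencies are not reported here.

```python
# Standard .05 critical values of chi-square for 1 to 4 degrees of freedom.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def chi_square_two_groups(high_counts, low_counts):
    """Chi-square for homogeneity of two groups over the same response categories.

    Returns (statistic, degrees_of_freedom). Categories empty in both groups
    are skipped when summing; at least one count in each group is assumed.
    """
    n_high, n_low = sum(high_counts), sum(low_counts)
    total = n_high + n_low
    statistic = 0.0
    for high, low in zip(high_counts, low_counts):
        column_total = high + low
        if column_total == 0:
            continue
        for observed, row_total in ((high, n_high), (low, n_low)):
            expected = row_total * column_total / total
            statistic += (observed - expected) ** 2 / expected
    return statistic, len(high_counts) - 1


# Hypothetical three-category item, 85 clerks in each criterion group.
stat, df = chi_square_two_groups([35, 40, 10], [20, 40, 25])
print(f"chi-square = {stat:.2f}, df = {df}, significant: {stat > CRITICAL_05[df]}")
```

An item would be retained for the scoring key only when its statistic exceeds the critical value for its degrees of freedom.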

The response categories for each item were 
assigned weights by the “vertical per cent 
method” (4, pp. 253-5). This method weights 
each category according to the difference in 
per cent of high and low selling cost employees
making the response. The per cent differences 
were reduced to simple integral weights by 
Strong’s Table of Net Weights (5), and were 


simplified further by adding a constant to 
eliminate negative signs. 
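The weighting steps above can be sketched in code. This is a simplified illustration: Strong's Table of Net Weights is not reproduced, so dividing the per cent difference by ten and rounding is used here purely as a stand-in assumption, and the category counts are hypothetical.

```python
def category_weights(low_counts, high_counts, points_per_weight=10.0):
    """Weight each response category by the difference in per cent of low and
    high selling cost employees giving it, then shift to remove negative signs.

    ASSUMPTION: rounding (difference / points_per_weight) replaces the lookup
    in Strong's Table of Net Weights, which is not available here.
    """
    n_low = sum(low_counts.values())
    n_high = sum(high_counts.values())
    raw = {}
    for category, low in low_counts.items():
        pct_low = 100.0 * low / n_low
        pct_high = 100.0 * high_counts.get(category, 0) / n_high
        raw[category] = round((pct_low - pct_high) / points_per_weight)
    constant = max(0, -min(raw.values()))  # added to eliminate negative weights
    return {category: weight + constant for category, weight in raw.items()}


# Hypothetical age item, 85 clerks in each group.
low = {"under 35": 17, "35-54": 51, "55 and over": 17}
high = {"under 35": 42, "35-54": 26, "55 and over": 17}
weights = category_weights(low, high)
print(weights)
```

An applicant's total score is then the sum of the weights of the categories she checks across all retained items.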


Results 
From the response categories characteristic
of low selling cost employees the following
composite description emerges of the “ideal”
low selling cost sales woman (in order of discrimination):
between 35 and 54 years of age,
13 to 16 years formal education, over five years
previous selling experience, over 160 pounds,
five years or less on next to last job, lives in
boarding house, over five years on last job,
minor executive as principal previous experience,
between 59 and 62 inches in height, one
to three dependents, widowed, and no lost
time in last two years.

To obtain a cross-validation check on the 
scoring key, total scores were computed for 
another sample of 100 present employees. 
Half of this group was drawn from low selling 
cost sales clerks in various departments; the 
other half from the high selling cost employees. 
Table 1 shows the distribution of personal data 
scores for the two groups. There is appreci- 
able discrimination between the two groups as 
evidenced by the clustering of the scores at 
different parts of the score range. The differ- 
ence between the mean scores of the two groups 


is highly significant, the critical ratio being 
5.69. 
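The critical ratio can be reproduced from the means and standard deviations given in Table 1. The sketch assumes the usual large-sample formula, in which the difference between means is divided by the standard error of the difference; the function name is illustrative.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two means divided by the standard error of the difference."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# Means, standard deviations, and group sizes from Table 1.
cr = critical_ratio(36.1, 5.6, 50, 28.3, 7.9, 50)
print(f"critical ratio = {cr:.2f}")
```

Under this assumption the result is about 5.7, in line with the reported value of 5.69.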


Table 2 shows the per cent of high and low
selling cost employees scoring at and above
various cutting scores. With a cutting score of 30 at
employment, 68 of the 100 employees would
have been selected, of

Table 1
Distribution of Total Personal Data Scores of 50 Low
and 50 High Selling Cost Women Sales Clerks

Personal Data   Low Selling   High Selling
Scores          Cost Group    Cost Group     Total
50-54                1                          1
45-49                1                          1
40-44               13             3           16
35-39               13            11           24
30-34               18             8           26
25-29                3            11           14
20-24                1            11           12
15-19                              3            3
10-14                              3            3
Total               50            50          100
Mean              36.1          28.3
S.D.               5.6           7.9




Table 2
Per Cent of Low and High Selling Cost Sales Clerks
Scoring at and above Various Cutting Scores,
and Per Cent Selected at Each Cutting
Score Who are of Low Selling Cost

            Number      Per Cent      Per Cent       Per Cent
            Accepted    Low Selling   High Selling   Selected of
Cutting     (Total      Cost          Cost           Low Selling
Score       Group)      Accepted      Accepted       Cost
45             2            4             0
40            18           30             6
35            42           56            28
30            68           92            44             68
25            82           98            66             60
20            94          100            88
15            97          100            94
10           100          100           100
which 68 per cent would have proved to be of
low selling cost.
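The cutting-score figures follow arithmetically from the frequencies in Table 1. The sketch below rebuilds one row of such a table from those frequencies; the dictionary layout and function name are illustrative, and the output is a recomputation rather than a transcription of the printed table.

```python
# Lower bound of each score interval -> (low selling cost count, high selling
# cost count), taken from Table 1.
TABLE_1 = {
    50: (1, 0), 45: (1, 0), 40: (13, 3), 35: (13, 11), 30: (18, 8),
    25: (3, 11), 20: (1, 11), 15: (0, 3), 10: (0, 3),
}


def cutting_score_row(cutting_score):
    """Numbers and per cents accepted when everyone at or above the score is selected."""
    n_low = sum(low for low, _ in TABLE_1.values())
    n_high = sum(high for _, high in TABLE_1.values())
    low = sum(l for bound, (l, _) in TABLE_1.items() if bound >= cutting_score)
    high = sum(h for bound, (_, h) in TABLE_1.items() if bound >= cutting_score)
    accepted = low + high
    return {
        "accepted": accepted,
        "pct_low_accepted": 100.0 * low / n_low,
        "pct_high_accepted": 100.0 * high / n_high,
        "pct_selected_low": 100.0 * low / accepted,
    }


row = cutting_score_row(30)
print(row)
```

At a cutting score of 30 this gives 68 clerks accepted, 92 per cent of the low and 44 per cent of the high selling cost group, and roughly 68 per cent of those accepted being of low selling cost.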


Summary 


Analysis of the application blanks of 170
women department store sales clerks showed
that 12 personal data items significantly distinguished
between high and low selling cost
employees. When applied to 100 present employees,
total weighted personal data scores
showed a substantial relationship to selling
performance.

These results are in accordance with other
findings on department store sales personnel,
i.e., personality, personal situation, and relevant
experience are useful predictors of
success.


Received March 9, 1951. 


References

1. Anderson, V. V. Psychiatry in industry. New
York: Harper, 1929.
2. Kitson, H. D. Psychology of vocational adjustment.
Philadelphia: J. B. Lippincott, 1925.
3. Stead, W. H. The department store salesperson.
Occupations, 1937, 15, 513-515.
4. Stead, W. H., Shartle, C. L., et al. Occupational
counseling techniques. New York: American
Book Company, 1940.
5. Strong, E. K., Jr. An interest test for personnel
managers. J. Person. Res., 1926, 5, 194-203.


Prediction of Academic Success in Dental School 


Irving Weiss 


University of Kansas City 


For the past four years the School of Den- 
tistry at the University of Kansas City has 
participated in a nation-wide testing program 
conducted by the Council on Dental Educa- 
tion of the American Dental Association (8). 
One aspect of this testing program is an at- 
tempt to predict success or failure of dental 
students as measured by their grade point 
average in dental school. Since 1946 every 
Freshman class has been given a battery of 
tests shortly after the beginning of the Fall 
semester. 


Test Battery 


The tests in this battery follow the two 
major divisions of the dental curriculum into 
theory and technic work. Theory courses 
cover the basic scientific and theoretical 
groundwork of dentistry, while technic courses 
include the manual, mechanical and clinical 
aspects. Courses in anatomy, oral pathology, 
and orthodontics are indicative of the former, 
while prosthetics, crown and bridge, operative 
and clinical dentistry are representative of the 
latter. Correlations between theory and tech- 
nic grade point averages have been generally 
found to be sufficiently low to encourage the 
development of separate indices for their pre- 
diction. Bellows (1) found a correlation of 
.38 between theory and technic GPA, while 
Wagner (11) at the University of Pittsburgh
found this correlation to be .21. This correla- 
tion was found to range from .30 to .51 for the 
three classes in the present study. 

For the class of 1946, this correlation was
.30, .51, and .41, respectively, for their Freshman,
Sophomore, and Junior years while for
the class of 1947, their correlation was .34 and
.47 for their Freshman and Sophomore years,
respectively. The Freshman class of 1948 had
a correlation of .33 between theory and technic
GPA.


The test battery as of 1949 consisted of the 
following: 




For the prediction of theory grades. 


1. An intelligence test at the college level, 
divided into a quantitative (Q) and a linguistic 
(L) section. Reliability is greater than .9. 

2. A science test prepared especially for the 
Council on Dental Education is an achieve- 
ment test in which the subtest scores are bio- 
logy, chemistry, physics, factual information, 
application of principles, and total. These 
last three scores are combinations of the first 
three. Reliability data have not yet been
released.

3. A test on the interpretation of reading 
materials in the natural sciences. This is 
approximately half of a college level test. No 
reliability data are available. 


For the prediction of technic grades. 


1. Paper and pencil test of object visual- 
ization in three dimensions. Reliability by 
the split half method is reported to be .91. 

2. A carving dexterity test consisting of a 
simple drawing which the student attempts 
to duplicate in chalk, using a carving tool and 
a ruler. The accuracy of the carving is judged 
by a board in Chicago. The design is altered 


yearly. No data are available concerning 
test reliability. 


Pre-Dental College Achievement 


In addition to tests as predictors of dental 


grades, another predictive factor should be 
considered, namely, a measure of college 
achievement as indicated by pre-dental grade 
point average. As indicated by previous in- 


Wagner (11) fou 
for similar value 




Harris (5), and Graves (4) have found cor- 
relations between dental grade point average 
(theory and technic combined) and pre-dental 
grade point average ranging from .34 to .51.
As indicated in Table 1, the results obtained 
in this study are in general agreement with the 
above findings. Correlation between theory 
and pre-dental GPA range from .40 to .54, and 
between technic and pre-dental GPA from .12 
to .21. It is interesting to note that the total
pre-dental grade point average and the re- 
quired science grade point average are equi- 
potent predictors of dental school achievement 
for the 1946 and 1948 classes, and that for 
these two classes, differences between their 
correlation with dental theory could be ascribed
to chance fluctuations,


biology.) However 
relations of .48 betw 


e pre-dental grade point 
As only ten to 


neous grading systems 
uld not be easily made, 


were available on igi i 

i git normal 
scores which ranged from — + i 
thirty-nine Participating 
the local data seldom ra 




tervals, Sheppard’s correction for broad grouping
was employed in calculating the first order
correlations. The multiple correlation coefficient
was calculated by the Wherry-Doolittle
method of test selection (10).

These results indicate a moderate positive
relationship between dental school theory
grades and the predictor variables. For the
prediction of theory grade point average, some
form of pre-dental grade point average and
sections of the science test are the most fruitful.
Except for the class of 1947, differences
in the correlation between the theory criterion
and the pre-dental science grade point average
as contrasted with the correlations between
the same criterion and the total pre-dental
grade point average, can be ascribed to chance
errors in sampling. The chemistry and biology
sections of the science test each
increase one of the multiple correlation coefficients.
However, the higher correlation with
the theory criterion of these sections of the
science test as compared with the other sections,
may in some cases be due to chance
errors in sampling. The addition of any
further tests either adds more chance errors
than actual validity to the test battery, or
does not increase the size of the multiple correlation
coefficient by any significant amount.

The low correlation of Q and L scores with
theory grades has also been found by Wagner
(11). However, Peterson (9) has reported
this correlation to be .56 for one dental school.
It was hoped that this test could have been
used as a suppressor variable, but this did not
prove to be the case. The reading test shows
a moderate correlation with theory grade point
average, as well as a rather high correlation
with pre-dental grade point average and the
chemistry section of the science test. As a
result of this correlation between the reading
test and grade point average, the addition of
the reading test to the test battery does not
increase the size of the shrunken multiple
correlation coefficient.


Technic Criteria

The correlations between technic criteria
and the carving and object visualization tests
were significantly lower than those between
the theory criteria and several of the corres-




Table 1 


i vi A) with Dental Theory 
f i -dental Grade Point Average (GP. 
ERE m e Grade Point Average (GPA) 


Class of 1946 


Class of 1947 Class of 1948 


N = 106 San Tiha GOAT 
‘funioe GPAt Sophomore GPA teshman GPs 
Theory Technic Theory (a) Technic = Theory (b) Technic 

zag A 54 23 40 -20 

Pre-dental GPA (1) Me = 43 16 Ad 2 
Pre-dental Science (2) 45 ae 

GPA (required) = x 34 .20 .32 -00 

Science Test Bio. (3) s é 47 .12 AT 14 

Chem. (4) $ * .28 .26 AS .09 

Phy. x * 40 .23 46 10 

Fact. à * 40 31 228 02 

Appl. š * Al 28 36 A 

Tot. i * 46 .20 40 a3 

Reading Test _ 1S 00 00 03 

(Q = ae 31 05 8 10 

Intelligence (L ei z B "29 04 10 04 

T p ‘02 2 09 24 24 

Obj. Vis. 3 2 24 2 34 16 235 

Carving Test A a 


Rais = -63; Ro. = 52; and Rasa = .56. 


* Not given in 1946. 


† Grade Point Average for any year is based on grades earned in that year only.


ponding theory tests. In the junior and
sophomore years in Table 1, the correlation
figures for the object visualization test could
be from populations whose parameter values
are zero. Wagner (11) reports correlations of
.14 and .22 with technic grades for two freshman
classes (N is approximately 100 in each
class). In the present study the carving test
gave a uniformly moderate correlation with
technic criteria ranging from .24 to .35.
Wagner (11) found for the above mentioned
classes correlations of .30 and .43.
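Whether a correlation of this size "could be from a population whose parameter value is zero" can be judged roughly from its standard error. The sketch below is an illustration only: it assumes the large-sample standard error of a zero correlation, 1 / sqrt(N - 1), and a 1.96 criterion for the .05 level, neither of which is stated in the text as the test actually used.

```python
import math


def standard_error_of_zero_r(n):
    """Large-sample standard error of r when the population correlation is zero."""
    return 1.0 / math.sqrt(n - 1)


def ratio_to_standard_error(r, n):
    """Observed correlation expressed in standard-error units."""
    return r / standard_error_of_zero_r(n)


# Wagner's two object visualization correlations, N approximately 100.
for r in (0.14, 0.22):
    ratio = ratio_to_standard_error(r, 100)
    print(f"r = {r:.2f}: ratio = {ratio:.2f}, beyond 1.96: {ratio > 1.96}")
```

On these assumptions a correlation of .14 with about 100 cases falls short of the .05 criterion, while .22 exceeds it.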


Table 2

Intercorrelation between Yearly Theory Grade Point
Averages and Intercorrelation between Yearly
Technic Grade Point Averages *

                            Theory   Technic
Class of 1946
  Junior vs. Sophomore       .81       .35
  Junior vs. Freshman        .68       .25
Class of 1947
  Sophomore vs. Freshman               .47

* Grade point averages for any year are based on
grades earned in that year only.


The results of Table 2 show a fairly high 
relationship between the various years of 
theory work. A prediction based on one 
year’s work in theory is likely to hold for the 
remaining three years of theory work. How- 
ever, the correlation between technic grade 
point averages for the various years is signifi- 
cantly less than similar intercorrelations based 
on theory grades with the result that pre- 
diction of technic grades based on one year of 
such work does not appear to be too reliable. 


This may be due to the greater heterogeneity 
in the technic curriculum. 


Screening of Dental Applicants 


The use of the test battery and pre-dental
grades for the selection of future dental students
presents several problems in addition
to the magnitude of the correlation coefficients.
By the analysis of variance techniques, the 
variation in the means of the test scores from 
year to year is such that the three or four 
dental classes (test results but not grades were 
available for the freshman class of 1949) can- 
not be considered samples from the same population.
Moreover, a comparison of similar
intercorrelation coefficients among the several
independent variables for 1947 and 1948 indicated
that the differences could hardly be
attributed to random sampling errors. The
differences among the multiple correlation
coefficients are also statistically significant.
As a result, the assignment of relative weights 
to the independent variables for the selection 
of future dental students appears at the pre- 
sent time to be a rather approximate arrange- 
ment. 

In addition to the requirement of linearity 
of regression, the distribution of the criteria 
and the predictor variables should be at least 
approximately normal for predictive purposes. 
All the variables fulfill the requirements of 
linearity and normality with the exception of 
the technic criteria, which is linearly but not 
normally distributed. This lack of normality 
makes the prediction of technic work from any 


variable decidedly tenuous for the data in this 
study. 


and test scores, a 
distributed to 20 
each class, Thi 
student to check o 


measuring motivation, It is felt that the 
administration of these t i 


admission to dental school could conceivably 
be an additional factor which cannot be ac- 
counted for at the present time. 


Summary 


The present study has found that a combination
of pre-dental grades and some sections
of the science test will give moderately high
correlations with dental theory grades. If
used in the selection of dental school applicants,
of whom 20 to 40 per cent are selected,
these correlations will help in eliminating
potentially poor theory students. The non-normal
distribution of technic grades in this
study prevents at present a similar use in the
technic field.


Received April 2, 1951.


References 


1. Bellows, R. M. The status of selection and counseling
techniques for dental students. J. consult.
Psychol., 1940, 4, 10-16.

2. Douglas, H. R. Factors associated with scholastic
success in the school of dentistry at the University
of Minnesota. Proc. Amer. Ass. Dent. Sch.,
1938, 15, 172-179.

3. Freeman, H. J., and Smith, R. V. A report 
aptitude testing in dentistry at the Unive 
Towa. Proc. A mer, Ass. Dent. Sch., 1935, 1 
214-228, JI 

4. Graves, H. A. Factors in dental aptitude.
Proc. Amer. Ass. Dent. Sch., 1943, 20, 301-303.

5. Harris, A. M. The relative significance of measures
of mechanical aptitude, intelligence, and
previous scholarship for predicting achievement
in dental school. J. appl. Psychol., 1937, 21,
513-521.

6. McGrath, E. J. Predictive values of grades in
various types of predental courses. J. Dent.
Educ., 1942, 7, 46-52.

7. McNemar, Q. Psychological statistics. New York:
John Wiley and Sons, 1949.

8. Peterson, S, Dental aptitude testing Dee oil 
report of progress. J. Amer. Dent. Ass., 1941 
35, 175-184, an 

cA Peterson, S, Forecasting the success of Se: 
dental students through the aptitude Gee 
program. J. Amer. Dent, Ass., 1948, 37, 25 

5 


10. Stead, W. H., and Shartle, C. L., et al. Occupational
counseling techniques. Chicago: American
Book Company, 1940.

11. Wagner, R. F. A study of the critical requirements
for dentists. Unpublished Ph.D. thesis, University
of Pittsburgh, 1949.


Prediction in Auto 


Trade Courses * 


Isaac T. Littleton 
Chapel Hill, North Carolina 


Within the past 30 years interest in me- 
chanical aptitude and its measurement has in- 
creased tremendously. Psychologists have 
attempted to determine the factors that under- 
lie this aptitude, developed tests to measure 
some of them, and validated many of these 
tests for specific mechanical occupations. 

There has been little attempt, however, to 
compare the validities of tests which are de- 
signed to measure similar traits and to compare 
standard single tests with the subtests of 
mechanical aptitude batteries. 

This study was undertaken to compare the 
validities of two similar mechanical aptitude 
batteries with that of a combination of four 
selected single tests. Secondary aims were: 
(1) to compare the effectiveness of subtests 
and single tests, and their relative weights in 
the total batteries; and (2) to determine the 
values of the tests and batteries in predicting 
success in training in courses in Auto Mechanics
and Auto Body Repair and Painting in
a technical trade school. 


The Tests 
The aptitude batteries were chosen on the 
basis of two criteria: (1) their suitability for 
the trade school students; and (2) the apparent 
similarity in factorial content of their subtests.
The two batteries chosen were: 


1. The SRA Mechanical Aptitudes, Form 
AH, (7) which has three subtests: Mechanical 


Knowledge; Space Relations; and Shop Arith- 
metic, 


2. The California Prognostic Test of Mechanical
Abilities, Form A, (8) which has five
subtests: (1) Arithmetic Computation; (2)
Reading Simple Drawings and Blueprints;
(3) Identification and Use of Tools; (4) Spatial
Relationships; and (5) Checking Measurements
with a Ruler. Tests 1, 2 and 5 contain
items similar to the SRA Shop Arithmetic
subtest. Test 3 is similar to the Mechanical

* Based on data from a thesis submitted to the Committee
on Graduate Studies of the University of
Tennessee in partial fulfillment of the requirements for
the degree of Master of Arts.




Knowledge subtest of the SRA battery, and 
test 4 is similar to the SRA Space Relations 
subtest. 

Two criteria were used also in selecting the 
single tests: (1) they should measure about 
the same abilities or factors which the subtests 
in the batteries measure; and (2) they should 
be widely used standard tests with known 
validities for mechanical occupations. The 
following tests were chosen: 

1. The Bennett Test of Mechanical Compre- 
hension, Form AA (1). This test was used to 
get at the same component as the Mechanical 
Knowledge subtest of the SRA battery and the 
tool usage test of the Prognostic Test of Mechanical
Abilities.

2. The Revised Minnesota Form Board, Series 
MA (5), was selected because it is similar 
to the spatial relations subtests in the two 
batteries. 

3. The Purdue Industrial Training Classification
Test, Form A (4). It has items similar
to those of the Shop Arithmetic subtest of the
SRA battery and to those of subtests 1, 2 and 5
of the Prognostic Test of Mechanical Abilities.

4. The O'Rourke Survey Test of Vocabulary, 
Form X4 (6). This test was included to pro- 
vide a check on the verbal intelligence of the 
subjects and to determine how much a vocabu- 
lary test would add to the predictive power of 
the battery. 

The tests were administered in conformity 
with the instructions and time limits given in 
each test manual. The six tests were admin- 
istered over a period of two months—from 
July 21, 1949 to September 22, 1949, on an 
average of two weeks apart, in five different 
testing sessions. 


The Subjects and the Trade School 


The subjects in this investigation were 
students at the Knoxville Trade School in 
Knoxville, Tennessee. They were training 
for two automotive trades: namely, Auto 
Mechanics and Auto Body Repair and Painting. 
Due to the rapid turnover of students, and the 
rotation of instructors every three months 
from one class to another, a student, during 
his 18 months of training, was taught by 
several different instructors. It was, there- 


fore, possible to obtain more than one rating 
for each student. 




The ages of the students ranged from 18 to 
50 years. Their education ranged from no
schooling through 12 years. 

No student was included in the sample who 
had not been in training for at least two months 
when the criterion of Success was obtained. 
The mean months of attendance for both 
groups is about the same, 11.3 months for the 
Auto Mechanics and 11.2 months for the Auto 
Body Repairmen, 

Only students who were rated by two or
more instructors were included. In the final
sample, there were 85 subjects in the Auto
Body Repair group and 105 in the Auto Mechanics
group.


The Criterion 


the trades, Eight Body Re 
ranked the Body Repair stu 
Auto Mechanics instructors ranked the Auto 
Mechanics students. Each 

a different number as well as 
of students, Each student 


d some as man 


Each rank was 
on a 100-point 




Effect of Training 


Since there was a range in
months of attendance in the courses, it was
felt that this factor of length of training might
have affected the instructors’ judgments of the
students’ abilities, and the performances of the
students on the tests. For that reason the
number of months of attendance was correlated
with the criterion scores and the scores on the
tests.

The only test that correlated significantly
with months of attendance was the Bennett
Test of Mechanical Comprehension for the Auto
Mechanics. This correlation coefficient was
.28, significant at the one per cent level.

Partial correlations were computed for all
inter-test and criterion-test correlations with
months of attendance held constant. The
differences between the partial correlations
and the zero-order correlations were less than
.03. The standard error of a correlation of
zero was obtained for the Auto Mechanics group
and for the Auto Body Repairmen; for the latter
it is .10. Because
of the small differences between the zero-order
and partial correlation coefficients it
was concluded that the effect of months of
attendance on inter-test and criterion-test
correlations was negligible. Even the significant
correlation between months of attendance
and score on the Bennett became insignificant
in the partial correlations involving this test.
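Holding months of attendance constant corresponds to a first-order partial correlation. The sketch below uses the standard formula; only the value .28 is taken from the text, while the other two coefficients are hypothetical figures chosen for illustration.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y with z held constant (first-order partial)."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))


# x = test score, y = criterion, z = months of attendance.
r_test_criterion = 0.49      # hypothetical zero-order correlation
r_test_months = 0.28         # Bennett with months of attendance, from the text
r_criterion_months = 0.05    # hypothetical, assumed small

partial = partial_correlation(r_test_criterion, r_test_months, r_criterion_months)
print(f"zero-order = {r_test_criterion:.2f}, partial = {partial:.3f}")
```

With inputs of this size the partial coefficient differs from the zero-order coefficient by less than .03, which is the kind of small difference the paragraph reports.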


Statistical Procedure 
The two training groups were treated separately
in the statistical analysis.


1. The subte 
related with th 


Method, 


4. Multiple correlation coefficients for each
battery and for the combination of single tests
were computed.


cor 
5 e 
test correlations were ne 




Table 1 


Multiple Correlations and Bennett-Criterion 
Correlations 


Auto Body 
Auto Mechanics Repair 
Corrected Corrected 
for for 
Autenu- Attenu- 
Raw ation Raw ation 
SRA a a8 $2 
Prognostic Test of 
Mechanical Ability 46 58 9 4 
Single Tests 9 .62 53 59 
Bennett 49 62 32 oF 
worked to determine multiple correlation 


coefficients corrected for criterion attenuation. 


Results 


The basic findings are reported in Tables
1, 2, 3, and 4. Only the criterion-test correlation
coefficients corrected for attenuation are
reported here. The zero-order correlations 
are reported in the author’s thesis, which is on 
file at the University of Tennessee. 


1. For both groups, the highest correlation 
between any single test or subtest and the 
criterion was given by the Bennett Test of 




Mechanical Comprehension. These correla- 
tions were higher than the multiple correlation 
coefficients of either of the two standard 
batteries. The other three single tests, maxi- 
mally weighted, did not increase significantly 
the multiple correlation coefficients of the 
single tests over the Bennett-criterion correla- 
tion, for either training group. 

2. All three batteries were significantly 
correlated with the criterion scores, in both 
groups, when each subtest or single test was 
maximally weighted. 

3. The combination of single tests gave the
highest multiple correlation coefficients, with
the California Prognostic Test of Mechanical
Abilities second, and the SRA Mechanical
Aptitudes third. This order prevailed for
both training groups.

4. Of all the single tests, the vocabulary 
test contributed least to the multiple correla- 
tions in both groups. This would indicate 
that little may be gained by including a voca- 
bulary test in a battery for predicting this 
criterion. 

5. The multiple correlation coefficients obtained by using the criterion-test correlations after they were corrected for attenuation were raised considerably in all cases, thus yielding validity coefficients, as distinct from coefficients of predictive power.
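The correction referred to in item 5 can be sketched as follows, assuming the usual correction for unreliability in the criterion only, in which a raw validity coefficient is divided by the square root of the criterion reliability. The function name and the numerical values are hypothetical; this section of the article does not report the criterion reliability.

```python
import math


def correct_for_criterion_attenuation(r_raw: float, criterion_reliability: float) -> float:
    """Divide a raw test-criterion correlation by sqrt(criterion reliability)."""
    if not 0.0 < criterion_reliability <= 1.0:
        raise ValueError("criterion reliability must lie in (0, 1]")
    return r_raw / math.sqrt(criterion_reliability)


# Hypothetical values chosen for easy arithmetic (not taken from the article):
raw_validity = 0.48
criterion_reliability = 0.64

corrected = correct_for_criterion_attenuation(raw_validity, criterion_reliability)
print(round(corrected, 2))
```

Because the divisor is never greater than one, the corrected coefficient is never smaller than the raw one, which is consistent with the statement that the coefficients were raised in all cases.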


Table 2

Single Tests †

Intercorrelations and Criterion Correlations of Four Single Tests, with Means,
Standard Deviations, and Beta Weights

                             Bennett  Minnesota   O'Rourke    Purdue Industrial
                             (AA)     Form Board  Vocabulary  Training           Criterion  M     S.D.  Beta
Bennett (AA)                 --       .53         .46         .41                .62*       45.4  6.5   .59*
Minnesota Form Board         .58      --          .46         .42                .35*       31.9  10.0  [illeg.]
O'Rourke Vocabulary          .34      .49         --          .49                .29*       29.8  11.0  .01*
Purdue Industrial Training   .31      .53         .41         --                 .29*       8.2   4.9   .04*
Criterion                    .57*     .42*        .35*        .25                --         49.4  14.8
Mean                         43.9     31.3        31.5        6.3                49.9
S.D.                         7.5      12.3        12.9        4.7                16.2
Beta                         .51*     [illeg.]    .03*        [illeg.]

R_M = .62* ‡        R_B = .59* ‡

* Corrected for criterion attenuation.
† The figures in the upper right part of the table are for Auto Mechanic trainees (N = 105). Those in the lower left part are for Auto Body Repair and Painting trainees (N = 85).
‡ R_M means the multiple correlation coefficient for the Auto Mechanics trainee group; R_B means the multiple correlation coefficient for the Auto Body Repair and Painting trainee group.


Table 3

SRA Mechanical Aptitudes †

Intercorrelations and Criterion Correlations of the Subtests of the SRA Mechanical Aptitudes Battery,
with Means, Standard Deviations, and Beta Weights

                        Mechanical  Space      Shop
                        Knowledge   Relations  Arithmetic  Criterion  M         S.D.      Beta
Mechanical Knowledge    --          .44        .39         .50*       [illeg.]  7.8       [illeg.]
Space Relations         [illeg.]    --         .34         .34        [illeg.]  4.1       .06
Shop Arithmetic         .16         .26        --          [illeg.]   [illeg.]  [illeg.]  [illeg.]
Criterion               [illeg.]    .38*       .32         --         49.4      14.8
Mean                    25.3        16.4       6.9         49.9
S.D.                    8.6         5.8        4.4         16.2
Beta                    .32*        .16*       .23*

R_M = [illeg.] ‡        R_B = [illeg.] ‡

* Corrected for attenuation.
† The figures in the upper right part of the table are for Auto Mechanic trainees (N = 105). Those in the lower left part are for the Auto Body Repair and Painting trainees (N = 85).
‡ R_M means the multiple correlation coefficient for the Auto Mechanics trainee group; R_B means the multiple correlation coefficient for the Auto Body Repair and Painting trainee group.


Table 4

Prognostic Test of Mechanical Abilities †

Intercorrelations and Criterion Correlations of the Subtests of the Prognostic Test of Mechanical Abilities,
with Means, Standard Deviations, and Beta Weights

                            Arith-    Blueprint  Tool      Space      Measurement
                            metic     Reading    Usage     Relations  with a Ruler  Criterion  M         S.D.      Beta
Arithmetic                  --        .30        [illeg.]  [illeg.]   [illeg.]      [illeg.]   [illeg.]  [illeg.]  [illeg.]
Blueprint Reading           .56       --         .41       .07        .44           .45*       [illeg.]  [illeg.]  [illeg.]
Tool Usage                  .41       .59        --        .19        .44           .39*       16.8      [illeg.]  [illeg.]
Space Relations             [illeg.]  .57        .47       --         .04           [illeg.]   7.2       2.4       .30*
Measurement with a Ruler    .36       .52        .39       .42        --            .50*       7.0       [illeg.]  [illeg.]
Criterion                   .04*      .37*       .28*      .47*       .18*          --         49.4      14.8
Mean                        6.3       6.6        16.2      7.2        7.2           49.9
S.D.                        2.9       3.6        3.3       2.4        4.3           16.2
Beta                        -.26*     .29*       .06*      .40*       -.08*

* Corrected for attenuation.
† The figures in the upper right part of the table are for Auto Mechanics trainees (N = 105). Those in the lower left part are for the Auto Body Repair and Painting trainees (N = 85).
‡ R_M means the multiple correlation coefficient for the Auto Mechanics trainee group; R_B means the multiple correlation coefficient for the Auto Body Repair and Painting trainee group.


6. After the unreliabilities of the two sets of criterion scores were removed by correcting for attenuation, the batteries and single tests predicted about equally well for both trades. [Illegible passage concerning the SRA multiple correlations for the two trades.] [illegible] Prognostic Test of Mechanical Abilities, the [illegible] the combination of single tests yielded slightly higher correlations for the Auto Mechanics group than for the Auto Body Repair group.

7. The evidence is not conclusive enough to make positive statements about the relative values of the various tests and subtests in the prediction of the criterion. A comparison of the correlation coefficients [illegible] indications of these values, however. These data tend to confirm other findings that mechanical knowledge is an extremely important component in mechanical aptitude. The space relations tests also yielded somewhat higher correlations than did the arithmetic and blueprint reading tests.


Received March 5, 1951. 


References 


1. Bennett, G. K. Test of mechanical comprehension, Form AA. Psychological Corporation, 1940.

2. Cureton, E. E. Validation against a fallible criterion. J. exp. Educ., 1933, 1, 258-263.

3. Hull, C. L. Aptitude testing. New York: World Book Publishing Company, 1928.

4. Lawshe, C. H., Jr., and Moutoux, A. C. Purdue industrial training classification test, Form A. Purdue University, 1940.

5. Likert, R., and Quasha, W. H. Revised Minnesota paper form board test, Series MA. Psychological Corporation, 1939.

6. O'Rourke, L. J. Survey test of vocabulary, Form X4. Psychological Corporation.

7. Richardson, Bellows, Henry and Company. SRA mechanical aptitudes, Form AA. Science Research Associates, 1947.

8. Wrightstone, J. W., and O'Toole, C. E. Prognostic test of mechanical abilities, Form A. California Test Bureau, 1947.


— 


Accuracy of Students’ Reported Honor Point Averages


Marvin D. Dunnette


Industrial Relations Center, University of Minnesota


The scholastic achievement of university students is often denoted in terms of a [illegible] honor point average. In [illegible] situations, a student’s transcript is not immediately available and reliance must be placed on his own report of his honor point average. The accuracy of this report depends not only on the degree of conscious or unconscious distortion which he may introduce into his statement, but also on the degree to which he actually knows the true value of his index of achievement. The present analysis was undertaken in an effort to determine the amount of confidence which may be placed in such reports.


Previous Studies

Previous studies have been concerned with the reliability of reports in a variety of situations involving reports by students and adults on scholastic standing, work histories, age, previous earnings, etc.

Vaughn and Reynolds (4) were interested in determining the reliability of public opinion interviewers’ reports of age, education, and socio-economic level. The results of their study indicate that interviewer ratings of socio-economic level are much less reliable than those of age and education, for which reliability runs gratifyingly high.

Keating, Paterson, and Stone (1) [illegible] of work histories [illegible] by unemployed [illegible] to +.98 for such details of work history as wages, duration of job, and job [illegible]. It is evident that these workers were well aware of the factual details of previous jobs and that they reported them with high accuracy.

Krueger’s study (2) is related to the [illegible] of test scores among college students. When grading errors were made on the students’ test papers, only 10 per cent of the students whose papers were graded too high reported the discrepancy, whereas 99 per cent of students graded too low discovered and reported the errors. The evidence is decisive that “self-interest” dictates the accuracy with which such errors are reported.

Paterson and Thornburg (3) checked the accuracy with which entering College of Engineering freshmen reported their scholastic standing with reference to whether they were in the top third, middle third, or lower third of their graduating class. The [illegible] indicated that the reports of [illegible] freshmen showed little correspondence with [illegible] standing in high school. Paterson and Thornburg concluded that such self-reports should [illegible] from persons. However, the [illegible] studies were both performed in settings in which there may have been more [illegible] toward distortion than in the [illegible].

[Author's footnote, largely illegible in the source:] [illegible] was completed while [illegible] Teaching Assistant in the [illegible] of Minnesota, 1950-51 [illegible] of a larger study [illegible] conducted in [illegible] requirements [illegible].

The Present Study


This study was performed in conjunction with the administration of an exploratory form of an engineering [illegible] test. The test was given to 203 seniors in the Institute of Technology at the University of Minnesota. Each senior was asked to indicate, to the best
of his ability, his honor point average¹ based on
all the courses he had taken as an under- 
graduate. The specific directions given to the 
group were as follows: “Now somewhere on 
your answer sheet, please put down your best 
guess of your total honor point average. It 
doesn’t have to be correct to the nearest hun- 
dredth, but do the best you can.” Previous 
instructions had indicated that the test to be 
given was exploratory in nature and that the 
results would not be used against the student 
in any way. This carried with it the implica- 
tion that the honor point averages were needed 
only for purposes of test validation. Thus, 
the actual recording of the estimated averages 
was probably free of any external reward or 
punishment involvement. Of the 203 stu- 
dents, only six failed to give this information. 
The true honor point averages were then ob-
tained from the files of the College of Engi-
neering. These were compared with the
students’ estimates by computing the product- 
moment correlation and by analyzing the data 
by means of a scatter diagram. 
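The honor point average used throughout is defined in the article's footnote as three honor points for each credit of A, two for B, one for C, and zero for D or F. A minimal sketch of that calculation follows; the function name and the course list are hypothetical, and credit-weighted averaging is assumed from the phrase "for each credit."

```python
# Honor points per credit, as stated in the article's footnote.
HONOR_POINTS = {"A": 3, "B": 2, "C": 1, "D": 0, "F": 0}


def honor_point_average(courses):
    """Credit-weighted honor point average for (grade, credits) pairs."""
    total_credits = sum(credits for _, credits in courses)
    if total_credits == 0:
        raise ValueError("at least one credit is required")
    total_points = sum(HONOR_POINTS[grade] * credits for grade, credits in courses)
    return total_points / total_credits


# Hypothetical transcript (an assumption for illustration only).
transcript = [("A", 3), ("B", 5), ("C", 4), ("D", 3)]
hpa = honor_point_average(transcript)
print(round(hpa, 2))
```

On this scale a straight "C" record gives exactly 1.00, which is the graduation threshold the article discusses later.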


Results 


Table 1 shows the relationship between the reported honor point averages and the actual honor point averages. The Pearson correlation coefficient between these two measures is +.94. Thus it may be concluded that the seniors were well aware of their true overall average. The correlation is higher than that which would have been obtained had the students relied merely on their previous quarter’s grades for the formulation of their estimates. However, we may yet ask what the degree of accuracy of the reported averages is. The high degree of association does not insure high correspondence between the absolute values of the two measures. It guarantees only a constancy of rank placement of one with respect to the other. Thus, it would be possible for the distribution of reported averages to be displaced constantly upward or downward with respect to the distribution of true averages. That this did indeed occur within a subgroup of the seniors is clearly evident in Table 1. The students whose actual averages are below 1.00 tend, as a group, to suppress this fact. Thus, of 62 students whose true averages were below 1.00, only 30 reported their averages as such. This distortion did not occur at the other end of the distribution. Thus, of 28 students who reported averages above 2.00, 27 actually did have such averages.

¹ The honor point average is calculated on the basis of three honor points for each credit of A, two for B, one for C and zero for either D or F.


Table 1

Relationship between Reported and True Honor Point Averages for 197 Engineering Seniors
Note: Pearson r for ungrouped data = +.94

                                             True HPA
Reported HPA   0.2-0.59  0.6-0.99  1.0-1.39  1.4-1.79  1.8-2.19  2.2-2.59  2.6-2.99  Total
2.6-2.99                                                                              [illeg.]
2.2-2.59                                                                              12
1.8-2.19                                                                              21
1.4-1.79                                                                              42
1.0-1.39                                                                              87
0.6-0.99                                                                              28
0.2-0.59                                                                              2
Total          8         54        68        34        16        11        6         197

[Cell frequencies are illegible in the source; only the marginal totals are shown.]
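The distinction drawn above, between a high correlation and agreement in absolute value, can be illustrated with a short sketch. The data are hypothetical: every reported average is set a constant 0.3 above the true one, an assumption made purely to show that Pearson r is unaffected by such a displacement.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical true averages, and reports displaced upward by a constant.
true_hpa = [0.7, 0.9, 1.2, 1.5, 1.9, 2.4]
reported_hpa = [t + 0.3 for t in true_hpa]

r = pearson_r(true_hpa, reported_hpa)
mean_overstatement = sum(rep - t for rep, t in zip(reported_hpa, true_hpa)) / len(true_hpa)
print(round(r, 3), round(mean_overstatement, 2))
```

The correlation stays at essentially 1.0 even though every report overstates the true value, which is why the scatter diagram is examined in addition to the coefficient.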

In view of the accuracy with which the rest 
of the group reported, it may be concluded 
that the distortion within the lower group is 
a reflection of a tendency toward conscious 
or unconscious falsification. Since a “C” 
average is required for graduation, this tend- 
ency toward overestimation may be caused 
by the student’s desire to be on the “safe side.” 
It is interesting, however, to speculate on the 
degree to which this result may indicate a 
tendency on the part of the low ability student 
to fool himself through a rationalization of 
his poor grades. At any rate, the results 
within this subgroup are surprising when it is 
remembered that the directions were such as 
to minimize the incentive toward falsification. 

In contrasting this study with previous ones reported, the results bear out the high accuracy of work history data reported by Keating, Paterson, and Stone. The marked distortions reported in the study by Paterson and Thornburg were probably due, in part, to the fact that students were asked to indicate whether they fell in the top third, middle third, or lower third of
their graduating classes. A student, even
though he knows his scholastic average very 
well, may yet lack the specific information 
which would allow him to indicate accurately 
his percentile standing in the class. Thus, 
a student believing himself to be near the next 
higher group will, in the absence of normative 
data, place himself in that group. A large 
amount of distortion in the early study may 
have been due to such a lack of normative 
information. 

The present study indicates quite con- 
clusively that most graduating seniors know 
their total honor point average with a high 
degree of accuracy and so report it. However, 
the distortion which occurred within the group 
characterized by low academic standing points 
to the fact that the accuracy of such self-re- 
ports differs according to the magnitude of 
the honor point average. Persons interested 
in such scholastic information would do well perhaps to place full reliance only in those reports which indicate marked scholastic superiority. For it is within this range that students tend to know their average most precisely, and their feeling of pride concerning their achievements evidently precludes the occurrence of distortion in either direction.


Received April 30, 1951.


References 


1. Keating, Elizabeth, Paterson, D. G., and Stone, C. H. Validity of work histories obtained by interview. J. appl. Psychol., 1950, 34, 6-11.

2. Krueger, W. C. F. Students’ honesty in correcting grading errors. J. appl. Psychol., 1947, 31, [illegible]-535.

3. Paterson, D. G., and Thornburg, P. M. High school scholarship standing of freshman engineering students. J. engng. Educ., 1927, [illegible].

4. Vaughn, C. L., and Reynolds, W. A. Reliability of personal interview data. J. appl. Psychol., 1951, 35, 61-63.


Validity of Minnesota Occupational Rating Scales * 


Harold Geist


Stanford University


Many psychologists have been interested in relating test scores to occupational requirements. The Minnesota Employment Stabilization Research Institute pioneered with a standard battery of tests. Profiles were made for various occupational groups on intelligence, clerical, mechanical, spatial and manual dexterity tests (2). The Occupational Analysis Division of USES carried this technique still further. For selection purposes, batteries of the most valid tests which had been developed were used in varying combinations to establish profiles for each job studied. For guidance purposes, a standard battery was administered to persons employed in various families of occupations and patterns of aptitudes were ascertained (3). Occupations were classified according to occupational families in such a manner as to make some two hundred profiles represent approximately two thousand major occupations. Profiles were based on critical minimum scores rather than mean scores. Tests in the battery were selected on the basis of factor analysis studies of vocational aptitudes involved in occupations in various key parts of the country rather than in one or two localities.

The limitation imposed by the small number of occupations in the Minnesota Employment Stabilization Research Study was partially remedied by the publication of the Minnesota Occupational Rating Scales in 1941 (4).

The 1941 Minnesota Occupational Rating Scales contain a list of four hundred and thirty occupations, each classified according to minimum requirements with respect to six abilities: academic, mechanical, social intelligence, clerical, musical talent, and artistic. The definitions of these abilities as stated in the Manual (4) are:

* The author wishes to express his indebtedness to Dr. George Barahal, Assistant Professor of Education and Psychology, Wayne University, and former Director of the Counseling and Testing Center, Stanford University, and to Miss Patricia James, Psychometrist, Stanford University Counseling and Testing Center, for permitting him to examine the counseling records while employed there. He also wishes to express appreciation to Dr. H. B. McDaniel, Professor of Education and Psychology at Stanford University, for reviewing the manuscript and offering valuable suggestions. The author is at present serving as Chief Clinical Psychologist, Mare Island Naval Hospital, Mare Island, California.


1. Academic Ability. By academic ability 
is meant the ability to understand and manage 
ideas and symbols. 

2. Mechanical Ability. Mechanical ability
includes both the ability to manipulate con-
crete objects, to work with tools and machinery
and the materials of the physical world, and
the ability to deal mentally with mechanical
movements.

3. Social Intelligence. By social intelligence
is meant the ability to understand and manage
people, to act wisely in human relations.
4. Clerical Ability. By clerical ability is 
meant the ability to do rapidly and accurately 
detail work such as checking, measuring, classi- 
fying, computing, recording, proof-reading, and 
similar activities. 

5. Musical Talent. Musical talent requires
the capacity to sense sounds, to image these
sounds in reproductive and creative imagina-
tion, to be aroused by them emotionally, to be
capable of sustained thinking in terms of these
experiences, and ordinarily the ability to give
some form of expression in musical perform-
ance or in creative music.

6. Artistic Ability. Artistic ability refers 
both to the capacity to create forms of artistic 
merit and the capacity to recognize the com- 
parative merits of forms created. 


A review of the literature reveals that no 
validation has been made of the Minnesota 
Scales. The authors of the Scales modestly 
claim that “these judgments are believed to 
yield information of value in our struggle to 
understand occupational requirements in terms 
of human abilities.” It is because these
Scales are of the “armchair” analysis type 
that some counselors are hesitant to use them 
as valid psychometric aids. 


Present Study 


The present study was undertaken to determine to what extent the profiles of the Rating Scales agree with the test profiles of a selected group of counselees. An arbitrary grouping of the four hundred and thirty occupations in the Scales was made according to the groupings of the Dictionary of Occupational Titles, i.e., each occupation was placed in a category of the DOT. Then the letter grade of each ability group was converted into a mean number grade for that category. For example, the first occupation in the Rating Scales is Accountant, which has the following rating for the ability grouping: (a) Academic, A; (b) Mechanical, D; (c) Clerical, A; and (d) Artistic, D.

For reasons to be stated later, only academic, 
mechanical, clerical, and artistic abilities were 
chosen. The rating of each of the ability 
groups was taken as the midpoint of the range 
of that letter grade. For example, in the 
Accountant occupation, A was considered to 
be 96.5 percentile, since the range as defined 
by the authors was 93-100 and D for mech- 
anical ability was taken as 12.5 percentile 
since the range was 1-25. The clerical and 
artistic ratings were similarly treated. 

The following arbitrary ratings were taken for each ability group, as defined by the authors of the Scales:

A = 96.5 percentile for academic, mechanical and clerical abilities, and 98.5 for artistic ability.

B = 83.5 percentile for academic, mechanical and clerical abilities, and 93.5 for artistic ability.

C = 50.5 percentile for academic, mechanical and clerical abilities, and 58.5 for artistic ability.

D = 12.5 percentile for academic, mechanical and clerical abilities, and 12.5 for artistic ability.
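The conversion just described, from letter grades to percentile midpoints and then to a mean profile per DOT category, can be sketched as follows. The percentile values and the Accountant ratings are those given in the text; the function names, the dictionary layout, and the second occupation are hypothetical and serve only to show the averaging step.

```python
# Percentile midpoints from the text; artistic ability uses its own scale.
PERCENTILE_BY_GRADE = {
    "A": {"other": 96.5, "artistic": 98.5},
    "B": {"other": 83.5, "artistic": 93.5},
    "C": {"other": 50.5, "artistic": 58.5},
    "D": {"other": 12.5, "artistic": 12.5},
}
ABILITIES = ("academic", "mechanical", "clerical", "artistic")


def grade_to_percentile(ability, grade):
    """Percentile midpoint for one letter grade on one ability."""
    scale = "artistic" if ability == "artistic" else "other"
    return PERCENTILE_BY_GRADE[grade][scale]


def category_profile(occupations):
    """Mean percentile per ability over all occupations in one category."""
    return {
        ability: sum(grade_to_percentile(ability, occ[ability]) for occ in occupations)
        / len(occupations)
        for ability in ABILITIES
    }


# Accountant ratings are from the text; the second occupation is made up.
accountant = {"academic": "A", "mechanical": "D", "clerical": "A", "artistic": "D"}
hypothetical_occupation = {"academic": "B", "mechanical": "C", "clerical": "B", "artistic": "C"}

profile = category_profile([accountant, hypothetical_occupation])
print(profile)
```

A simple unweighted mean is assumed here; the article does not say whether occupations within a category were weighted.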


A total of 150 counseling records were selected at random from the files of the Stanford University Guidance Center. The vocational objectives stated on these records [illegible] into DOT categories. [illegible] criteria of each record [illegible].

In addition, as many records as possible were obtained which also contained the score on the Minnesota Spatial Relations Test.

The following are the types of tests which the authors of the Minnesota Occupational Rating Scales state have been used in securing data for each of their categories: academic ability, tests of intelligence and academic achievement; mechanical ability, mechanical ability tests; clerical ability, clerical aptitude tests; and artistic ability, tests of art talent and art judgment.

It will be observed that two of the ability groupings, social intelligence and musical ability, have been omitted. It was felt that, although the authors of the Scales state that personality and interest tests were used to derive their ratings, the present state of such tests in the determination of “social intelligence” is so highly questionable that they were not used. Tests of musical talent were given at the Stanford Guidance Center to such a small number of counselees that this ability category was necessarily ignored.


Reasons for Choice of Tests in Ability Categories

The choice of tests to compare with the results of the Minnesota Occupational Rating Scales was a difficult problem. The authors of the Scales did not state which tests were used in their final summation. Tests were chosen which seemed to fit most neatly the description of the authors and at the same time were adequate for the population sampled. The principal problem in comparing the two sets of data was that of securing tests whose norms most nearly resembled or agreed with the vague “general population” norms of the Scales. Consequently, those tests were decided upon on which the scores most nearly represented the “general population.” Tests finally selected for the individual categories were:

1. Academic Ability. Civilian Edition of the AGCT. The AGCT Test (Civilian Edition) was chosen since it is a test of general learning ability and also is considered one of the better group intelligence tests. Correlation of the AGCT with other well known tests of general learning ability is high (Army Alpha = .83; Otis = .79). It also seemed to meet the specification of standardization in what closely approximates a general population.

2. Mechanical Ability. [illegible] So far as is known to this author, there is no test of mechanical ability which has “general population” norms meeting the requirements of the specifications of the authors of the Minnesota Scales.

3. Clerical Ability. The Minnesota Clerical Test was chosen because it is an aptitude test measuring clerical speed and accuracy and is one of the few clerical tests which has norms on gainfully occupied adults.

4. Artistic Ability. The choice of an adequate art test was a difficult one. After [illegible] art judgment tests at the Stanford Guidance Center, it was decided to use the Meier-Seashore Art Test. According to the [illegible] test, its purpose is to measure [illegible] traits or abilities: manual skill, [illegible] perseveration, aesthetic judgment or [illegible], perceptual facility, creative imagination, and aesthetic judgment. This qualitative description far exceeds the requirements of the Minnesota Scales. Unfortunately the norms are on college adult art students and not general population norms.


Results 


For each of the nine categories of the DOT, the profiles of the Minnesota Scales were plotted against the corresponding mean empirical test scores converted to percentiles. Differences were plotted in terms of plus or minus between the Minnesota Scales and the mean test scores. Figures 1, 2, and 3 furnish the data for the analyses which follow:

Professional. In the professional category the profiles are fairly similar. The chief discrepancies seem to be in the academic ability categories and in the artistic category. It appears rather surprising that the mean percentile scores of the experimental group are below the estimate of the Minnesota Scales. It is probable that the Minnesota Scales are more nearly correct since the majority of professional workers in Stewart’s study (5) are 1.5 sigma above the mean, which would place them at about the ninetieth percentile.
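The conversion from sigma units to percentiles used in this argument can be checked with a short sketch. It assumes a normal distribution of scores, which is an assumption of the sketch; the article does not say how its figure was obtained, and the function name is hypothetical.

```python
import math


def normal_percentile(z: float) -> float:
    """Percentile rank of a z-score under the standard normal distribution."""
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


for z in (1.0, 1.5):
    print(z, round(normal_percentile(z), 1))
```

Under this assumption 1.5 sigma falls near the 93rd percentile and one sigma near the 84th, so "about the ninetieth percentile" is a slightly conservative reading of the 1.5 sigma figure.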


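Fig. 1. Comparison of Minnesota Scales with Empirical Test Scores of Professional, Managerial, and Sales Groups. [Three panels: Professional, Managerial, Sales; horizontal axis: Acad., Mech., Cler., Art.; vertical axis: percentiles; legend: Minn. Scales, Experimental.]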


Managerial. It is in the managerial group that the similarity between the test scores of the experimental group and the Minnesota Scales is most striking. The profiles are almost identical. The academic ability test scores of the experimental records are higher than the estimate of the Minnesota Scales. The reason again is probably the population sampled.

Sales. In the sales group the two profiles, although not quite as similar as those of the managerial group, are almost identical. It is probable that the discrepancy between the artistic ability groupings is due to the large number of salesmen of artistic equipment and implements included in the experimental records.

Services (Protective). In this group, the most striking difference is in the clerical grouping. The protective service category of the DOT consists mainly of police and allied personnel. In the higher “bracket” positions, it is doubtful that a high clerical ability (62nd percentile, Minnesota Scales) is needed for this occupation, although it is conceivable that in such a position as desk sergeant, a high order of clerical ability is desired. An overall view of this group would seem to indicate that the experimental group would more nearly represent the clerical ability level required of people in protective services.

Services (Personal). Here there is the greatest discrepancy between the Scales and the experimental group. It is probable that in a group where the skills required are of a rather low order, any attempt at prediction is at best nebulous. Consequently, it is felt that the true profile probably lies between the experimental group and the Minnesota Scales.

Services (Domestic). Since no domestic objectives were chosen, this category was omitted. This was likewise true of Building Service Workers and Porters.

Agriculture. Since the experimental group only included objectives in the agricultural and horticultural occupations, the fishery, forestry, hunting and trapping occupations were omitted. It will be observed that the shapes of the profiles of the experimental group and the Minnesota Scales are similar, but the discrepancies are large. There are probably two reasons for this. The first is the difference in the [illegible] grouping. The second is that the Minnesota Scales were published approximately ten years ago, and the Scales were devised some time before that. The [illegible] skill necessary for agricultural workers today [illegible] the mechanization of agriculture [illegible] necessitated farmers in general to possess greater skills in the ability groups sampled, especially in the mechanical area, where the need is the greatest.

Fig. 2. Comparison of Minnesota Scales with Empirical Test Scores of Protective, Personal Service, and Agriculture. [Three panels: Protective, Personal Service, Agriculture; horizontal axis: Acad., Mech., Cler., Art.; vertical axis: percentiles; legend: Minn. Scales, Experimental.]

Fig. 3. Comparison of Minnesota Scales with Empirical Test Scores of Semiskilled, Skilled, and Clerical. [Three panels: Semi-skilled, Skilled, Clerical; horizontal axis: Acad., Mech., Cler., Art.; vertical axis: percentiles; legend: Minn. Scales, Experimental, Experimental motor dexterity.]
Skilled and Semiskilled. The abbreviated definitions of the skilled and semiskilled categories according to Shartle (6) are: Skilled, “Includes craft and manual occupations that require predominantly a thorough and comprehensive knowledge of processes involved in the work, the exercise of considerable independent judgment, usually a high degree of manual dexterity, and in some instances extensive responsibility for valuable products or [illegible],” and Semiskilled, “The exercise of manipulative ability of a high order, but limited to a fairly well defined work routine. These occupations may require the performance of part of a craft or skilled occupation, but usually to a limited extent.”

It was felt that the addition of a motor dexterity criterion was necessary to complete this category. Consequently, all cases in this category had, in addition to their other tests, the Minnesota Spatial Relations Test results. Since the Minnesota Spatial Relations Test norms are in percentile ratings corresponding quite closely with the Minnesota Occupational Rating Scales, the results fitted in quite closely with the other test results.

The most outstanding difference between the two profiles is the discrepancy in the academic ability category. It appears that the estimate of the Minnesota Scales is low, while the experimental group is high, the true points probably lying somewhere between. In the theoretical mechanical group, the results of the experimental group and the Minnesota Scales are identical. The experimental group reached the sixty-ninth percentile on the motor dexterity test. The results of this test could not be integrated with the mechanical comprehension tests, for although they are lumped together by the Minnesota Scales they tap essentially different abilities.

Skilled. In the skilled category, one item is outstanding. The academic and mechanical ability (both theoretical and motor) groupings of both the Minnesota Scales and the experimental group are practically identical. The large discrepancies in the artistic and clerical categories might be explained by the difference in sampling.

Clerical. The clerical category shows a distinctly higher rating on the Minnesota Scales in the academic ability grouping. Since the majority of those in the clerical category in Stewart’s study (5) lie one sigma above the mean, the Minnesota Scales appear to be more nearly correct. A tendency to underestimate the intellectual requirements of clerical personnel by counselors in a college guidance center may affect the sample. The percentile ratings of the clerical grouping are identical or practically identical, as would be expected in this category.

Summary 

1. The 430 occupations of the Minnesota
Occupational Rating Scales were divided into 
nine categories of the Dictionary of Occupa- 
tional Titles. 

2. Each of the categories was subdivided 
into four of the six ability groupings of the 
Scales. 

3. The mean score of each ability group for
all categories was computed and profiles made
for each of the categories.


4. An experimental group from the Stanford 
University Guidance Center was selected and 
the objectives were likewise divided into the 
nine categories of the Dictionary of Occupational
Titles.

5. Test results were obtained on those ability 
groupings chosen for the Scales. These test
results were selected on the basis of the tests 
which most nearly represented the definition 
of tests by the authors of the Scales. 

6. Test results of the experimental group
were compared with the finding of the Scales
by profiles.

7. The test profiles of the experimental
group agreed quite closely with those of the
Scales.

8. Further research in this field is indicated.


Received March 5, 1951. 


References 
1. Bennett, G. K., and Fry, D. E. Manual for test of
mechanical comprehension, Form BB. New York:
The Psychological Corporation, 1941.
2. Dvorak, Beatrice J. Differential occupational ability
patterns. Bulletin of Employment Stabilization
Research Institute, University of Minnesota,
1935, Volume III, Number 8 (out of print).
3. Dvorak, Beatrice J. The new USES General Aptitude
Test Battery. J. appl. Psychol., 1941, [volume illegible],
372-376.
4. Paterson, D. G., Gerken, C. d’A., and Hahn, M. E.
The Minnesota Occupational Rating Scales. Chicago:
Science Research Associates, 1941.
5. Stewart, Naomi. AGCT scores of Army personnel
grouped by occupations. Occupations, 194[illegible],
541.
6. Shartle, C. L. Occupational information. New York:
Prentice-Hall Co., 1946.


Lecture vs. Group Decision in Changing Behavior * 


Jacob Levine 
V. A. Hospital, Newington, Conn. 


and 


John Butler 
Trinity College, Hartford, Conn. 


To the industrial or group leader who is 
seeking to change the behavior or attitudes of 
people, psychology has had little to offer in the 
way of practical techniques or guiding prin- 
ciples. One of the few important contribu- 
tions to this problem was that of Lewin (1) 
when he compared the relative effectiveness 
of group decision with formal lectures in in- 
fluencing a group of women to change their 
eating habits during the war. His findings
indicated that group decision was the more 
effective method. However, Lewin recognized 
that his results may have been due to a differ- 
ence in expectation between the two groups, 
for the group decision group had been told that 
a later inquiry would be made as to whether or 
not the members had carried out the suggested 
changes. No such information was given to 
the Lecture group. This forewarning of a
subsequent checkup may have had some influence
on the decision of the first group to
change with a consequent bias in the result.

Studies on prejudice and social attitudes
have demonstrated that education in itself
does not reduce prejudice nor change attitudes
significantly. See Samelson (4). The complex
relationship between learning, perception,
and motivation is no more dramatically illustrated
than here, where learning and correct
perception can occur without leading to significant
action. Were the acquisition of knowledge
alone sufficient to lead to behavior
change, many individuals would not be
repeating again and again the same personally
and socially disastrous behavior patterns
though they well know that different behavior
would lead to more successful social relations.
Within this problem lie hidden some of the
most crucial problems of human adjustment
and learning.

* Submitted with the approval of the Chief Medical
Director, Department of Medicine and Surgery, Veterans
Administration, who assumes no responsibility for
the opinions expressed in this paper which are those of
the authors.

Though we recognize the importance of moti- 
vation as well as the acquisition of knowledge 
in social change, it is often difficult to deter- 
mine the changing motivational factors in the 
specific situation. This cannot be omitted 
from the understanding of such problems as to 
why group decision is more effective as a be- 
havior modifier than is the formal lecture. 
One can talk about greater ego involvement 
in the one case but just how this is related to 
motivation and to action is far from clear. As 
Lewin has pointed out, a higher degree of ego 
involvement does not necessarily lead to a 
decision to act. He suggested that perhaps 
in group decision the members are more likely 
“to make up their minds” or reach a decision. 
And though the making of a decision takes 
but a minute or two, once it is made, “it has 
an effect of freezing this motivational con- 
stellation for action.” But this explanation 
does not tell us how it is that the individual 
makes a decision to act more readily in the 
group decision than he does in the lecture. 
Though in each case, the translation of de- 
cision into action was ultimately made by the 
individual, it would seem that the step from 
the absorption of basic information to the 
making of a decision was more of a group pro- 
cess in the group decision method than it was 
in the lecture method. 

The present experiment was designed to 
repeat that of Lewin in a different setting
under carefully controlled conditions of in- 
formation given and behavior changes meas- 
ured. In this study, group decision is com- 
pared with formal lecture as a method of
producing changes in socially undesirable
behavior. Both methods are then compared 
with one in which no attempt is made to bring 
about any change. Thus, the experiment was 
designed to answer two questions: 1. Is the 
acquisition of knowledge enough to lead a 
group of individuals to change a socially un- 
desirable behavior pattern? 2. Is group de- 
cision a more effective method of producing a 
change in behavior than is the formal lecture? 


The Experiment 


The subjects consisted of 29 supervisors of 
395 workers in a large manufacturing plant. 
The workers were on an hourly rate. These 
factory workers represented a wide variety of 
jobs and skills, ranging from unskilled manual 
labor to the most highly skilled machinist and 
toolmakers. All of these jobs were classified 
into nine different grades on the basis of skill
and training required. 

Within each job grade three different hourly
wage rates prevailed. The particular rate
paid to any worker was determined in large
part by the quality of his performance on the
job. Performance was evaluated by one of
the 29 foremen who supervised the work of
these 395 men. Every 6 months each worker
was rated by his foreman on established rating
scales for 5 factors: 1. Accuracy; 2. Effective
use of working time; 3. Output; 4.
Application of job knowledge; and 5. Cooperation.
The sum of the scores on each of
these five scales comprised a worker’s total
performance rating and determined what wage
rate he would get.
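
The scoring rule just described (five factor scores summed into a total that fixes one of three wage rates within a job grade) can be illustrated with a minimal sketch. The numeric scale, the cutoff values, and the names `FactorRatings` and `wage_step` are assumptions made only for illustration; the article does not report the actual scale values or the cutoffs used by the plant.

```python
from dataclasses import dataclass

# The five factors named in the text; the numeric scale is an assumption.
FACTORS = ("accuracy", "effective_use_of_time", "output",
           "job_knowledge", "cooperation")


@dataclass(frozen=True)
class FactorRatings:
    """One foreman's ratings of one worker on the five factors."""
    accuracy: int
    effective_use_of_time: int
    output: int
    job_knowledge: int
    cooperation: int

    def total(self) -> int:
        # Total performance rating = sum of the five factor scores.
        return sum(getattr(self, name) for name in FACTORS)


def wage_step(total: int, cutoffs=(8, 12)) -> int:
    """Map a total rating to one of three wage rates (1 = lowest).

    The cutoffs are hypothetical; the source gives no numeric rule.
    """
    low, high = cutoffs
    if total < low:
        return 1
    if total < high:
        return 2
    return 3


worker = FactorRatings(accuracy=3, effective_use_of_time=2, output=3,
                       job_knowledge=2, cooperation=3)
print(worker.total(), wage_step(worker.total()))
```

Because the wage step depends only on the total, any systematic tendency to rate one labor grade higher on every factor carries straight through to pay.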


Unfortunatel 


halo effect” re- 


most effective method of getting these supervisors
to change the basis for the ratings so
that a more equitable rating system would
prevail. Our objective was to help the
supervisors see that their task in rating a
worker was to consider only how well he did
his job and not how difficult the job was. He
was to understand that he was to rate the man
and not the job. The task of the present experiment
was to determine which was the more
effective method of achieving this change in
behavior of the 29 rating supervisors, group
decision or the formal lecture.


Experimental Procedure 


The 29 supervisors were randomly divided
into three groups of 9, 9, and 11. It may be
pointed out that all supervisors were experienced
raters and had been rating employees
for a number of years. The first group, Group
A, consisting of 9 supervisors of 120 men,
served as a control group, and received no
special instructions prior to rating. The
second group, Group B, consisted of 9 supervisors
of 123 men and served as the group decision
group. The third group, Group C, consisted
of 11 supervisors of 152 men, and served as
the lecture group.
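
A random division into groups of 9, 9, and 11 can be sketched as follows. This is only an illustration of one way to make such an assignment; the article does not say how the randomization was carried out, and the function name, the supervisor numbering, and the seed are hypothetical.

```python
import random


def split_supervisors(ids, sizes=(9, 9, 11), seed=None):
    """Shuffle the supervisor ids and cut them into groups of the given sizes."""
    ids = list(ids)
    if sum(sizes) != len(ids):
        raise ValueError("group sizes must add up to the number of supervisors")
    rng = random.Random(seed)
    rng.shuffle(ids)
    groups, start = [], 0
    for size in sizes:
        groups.append(ids[start:start + size])
        start += size
    return groups


# Supervisors numbered 1..29 purely for illustration.
supervisors = list(range(1, 30))
group_a, group_b, group_c = split_supervisors(supervisors, seed=1952)
print(len(group_a), len(group_b), len(group_c))
```

Every supervisor lands in exactly one group, so the control, group decision, and lecture groups do not overlap.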

Several days prior to rating, the members
of Group B were gathered together around a
table with the discussion leader. The leader
did not sit at the head of the table nor did he
lead the discussion. He introduced the problem
by showing a graph of the previous rating
and raised the question why it was that
the high skilled workers were consistently
rated higher in performance than the low
skilled. From that point on, the leader merely
acted as moderator and avoided injecting himself
into the discussion. All decisions and
opinions were made solely by the group members.
The discussion lasted one hour and a
half. The group expressed a number of ideas
and arrived at several conclusions. They
finally reached one decision acceptable to the
group: The way to avoid the inequalities in
rating was to disregard the difficulty of the
jobs and rate only the man doing the job.
Consideration was to be given only to how
well a worker was doing his job. All members
agreed on this decision.


Group C, the Lecture group, gathered in a
formal lecture room and all sat facing the
leader. They were given a detailed lecture
on the technique and theory of employee performance
rating. Some background material
on wage administration and job evaluation
was also included. The lecture carefully
pointed out the errors of their previous ratings
and interpreted the reasons for their occurrence.
He illustrated his lecture with graphs
and figures. He finally explained what each
individual was supposed to do: that he was to rate
the worker's performance and not difficulty of
the job. After the lecture, questions were
invited and asked by the raters; complete
answers were given. The total session lasted
about one hour and a half.


[illegible] we see the gradual decrease in mean
rating as we go down in the labor grade. This
“halo effect” characterizes each of the three
experimental groups. In the post-training
rating some changes are observable.

For the sake of simplification of the results,
the nine labor grades were arbitrarily divided
into two categories: Low Group and High
Group. The first four labor grades were placed
in the High and the last five in the Low.
When we compare the mean ratings of the Low
and the High labor groups prior to training we
find that the difference is significant at the 1%
level of confidence for two groups and at the
7% level for the third. In each case the size
of the difference is at least one-third of a rating
unit in a total of three units.

In the second rating, only Group B shows
any significant change in difference in the mean
ratings of Low and High Grades. For Group
A, the control group, the mean difference remains
almost the same, and is still significant
to the 1% level of confidence. Group C, the
lecture group, shows a small decrease in the
difference, but it is still significant to the 1%
level.

We may conclude that performance ratings
were significantly affected only after the raters
had had a group discussion and had reached a
group decision. Neither increased experience
in rating nor the learning about their previous
errors in rating had any significant influence.
Our findings completely confirm those of Lewin
in demonstrating the greater effectiveness of
group decision over the lecture method of
training.

The relationships between mean ratings for
pre- and posttraining are shown graphically in
Figures 1, 2, and 3. For Group A, the two
curves are seen to be essentially the same. All
pretraining curves slope downward from high
to low ratings more or less similarly. In the
post-training curves it is only the Group B,
which shows a flattening or equalization of
mean ratings for the 9 labor grades. It is
Table 1
Mean Rating Differences between Low and High Labor Groups Before and After Training Session

              Group A: Control         Group B: Group Decision   Group C: Lecture
Labor Grade   Mean Rating   Signif.*   Mean Rating   Signif.*    Mean Rating   Signif.*
              1st    2nd    (p)        1st    2nd    (p)         1st    2nd    (p)
Low
31 1.8 1.7 
E 1.7 1.7 :23 
High 1.7 1.7 a 20 1.7 .01 2.4 2.2 323 
Sj fan Dif. 2.0 2.0 . 33 00 -63 45 
gnif * 35 Pi 07 84 01 01 
©) o1 01 


* This is the probability that a difference this great or greater could have arisen simply through errors of
random sampling.
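
The Low and High comparison summarized in Table 1 rests on a simple grouping rule: the first four labor grades count as High, the last five as Low, and the quantity of interest is the difference between the two mean ratings. A minimal sketch of that computation follows. The ratings below are invented for illustration only and are not the study's data, and the function name is hypothetical; the significance test used by the authors is not specified in the text and is not reproduced here.

```python
from statistics import mean

HIGH_GRADES = frozenset({1, 2, 3, 4})     # first four labor grades
LOW_GRADES = frozenset({5, 6, 7, 8, 9})   # last five labor grades


def high_low_difference(ratings):
    """ratings: iterable of (labor_grade, rating) pairs.

    Returns (mean_low, mean_high, mean_high - mean_low).
    """
    ratings = list(ratings)
    high = [r for g, r in ratings if g in HIGH_GRADES]
    low = [r for g, r in ratings if g in LOW_GRADES]
    if not high or not low:
        raise ValueError("need at least one rating in each labor group")
    m_high, m_low = mean(high), mean(low)
    return m_low, m_high, m_high - m_low


# Hypothetical ratings for one group of raters, before and after a session.
before = [(1, 2.5), (2, 2.0), (4, 2.0), (5, 1.5), (7, 2.0), (9, 1.5)]
after = [(1, 2.0), (2, 1.5), (4, 2.0), (5, 2.0), (7, 1.5), (9, 2.0)]

_, _, diff_before = high_low_difference(before)
_, _, diff_after = high_low_difference(after)
print(round(diff_before, 2), round(diff_after, 2))
```

A difference near zero after the session corresponds to the flattening of the rating curve reported for the group decision raters.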




Fig. 1. Average rating of raters with no
training sessions (Group A). [Figure: average rating before and after, plotted against labor grade.]

Fig. 2. Average rating of raters before and after
group decision session (Group B). [Figure: average rating before and after, plotted against labor grade.]

Fig. 3. Average rating of raters before and after
lecture session (Group C). [Figure: average rating before and after, plotted against labor grade.]


interesting to note that of the three [illegible]
Group B had shown [illegible]
between high and low [illegible]
training. The post-training [illegible] ratings.
This reduction might be the result of
an increased conservatism in rating by these
raters as a consequence of the lecture, without
affecting their basically prejudiced ratings.


Discussion 

It is clear that group decision was more
effective in reducing the prejudiced ratings of
these factory supervisors than was the formal
lecture. This in itself is a significant finding.
But what seems to be even more striking is the
fact that the lecture method had practically
no influence upon the discrepancies in rating.
It is generally assumed that once an individual
or a group of individuals learn that they have
been behaving in a socially undesirable way
they will immediately take steps to change,
particularly if it is clear to these individuals
that it is their responsibility to eliminate such
errors. Our findings do not support this
notion. The acquisition of knowledge does
not automatically lead to action.

The findings also indicate that once a group
arrives at a decision to act, the members, even
though they may act as individuals, take on
that decision and act in accordance with it.
The force of this group decision was evidently
sufficient to overcome the resistance to change
in habitual ways of thinking and acting. How
these group forces were able to operate upon
the individual the present study does not reveal.
Further research is necessary to determine
whether or not group decision leads
to a “freezing of decision to act” whereas the
lecture method does not.


Summary

A formal lecture method was compared
with group decision in inducing 29 supervisors
of 395 factory workers to overcome their
biased performance ratings. The results
showed that only the group of supervisors
involved in group decision improved in their
ratings. The lecture group did not change
and persisted in overrating the more highly
skilled workers and underrating the less skilled.
The conclusion was drawn that group decision
is more effective than the formal lecture in
overcoming resistance to change in behavior.

Received April 11, 1951.


References 
1. Lewin, K. Group decision and social change. In
Newcomb, T., and Hartley (editors), Readings
in social psychology. New York: Holt, 1947.


2. Marrow, A., and French, J. R. P. Changing a
stereotype in industry. J. soc. Issues, 1945, 1,
33-37.

3. Radke, M., and Klisurich, D. Experiments in 
changing food habits. J. Amer. Dietetic Ass., 
1947, 23, 403-409. 

4. Samelson, B. Does education diminish prejudice? 
J. soc. Issues, 1945, 1, 11-13. 


Negro-White Army Test Scores and Last School Grade * 


Byron E. Fulk and Thomas W. Harrell 
University of Illinois 


This study was undertaken in order to com- 
pare the performance of Negroes and Whites 
on the Army General Classification Test 
(AGCT) in World War II. AGCT scores
used in the study were obtained from Manning 
and Informational Rosters of various organi- 
zations of the Army Air Force in World War II. 
These organizations were part of the Air Force 
Service Command. Included were such 
organizations as Headquarters Squadrons, 
Service Squadrons, Chemical Sections, Signal 
Companies, Quartermaster Companies and 
other organizations concerned with air base 
activities other than actual flying.

A White sample of 2,174 scores is compared
with 2,010 Negro scores. The two samples
are compared in terms of the means, the
standard deviations, and the per cent of overlap.
The groups have been sub-divided in
terms of school grade completed and comparisons
made at each level. The results of
the comparisons are shown in Table 1 and
Figure 1.

Mean scores of the Whites exceed those of
the Negroes at each grade level. All the
differences are statistically significant. The
lowest critical ratio for a difference was [illegible].
The percentage of Negroes whose scores exceed
the median score for the Whites as shown
in Table 1 is 12 for the entire population
studied. By years of school completed overlap
varies from 17 per cent at grade 12 to [illegible]
per cent at grade five. Overlap is higher at
the higher grades, beginning with school grade
ten, than it is at lower school grades.

It is not suggested that because two individuals
have attended school for an equal
period of time that the factor of schooling is
thus controlled. This method, however,
keeping last school grade constant, certainly
can be expected to cancel out some of the

* Fulk carried out the study reported herein as his
Master’s thesis under the supervision of Harrell.
Table 1
Mean AGCT Scores of Whites and Negroes by Years of Schooling

Years of     Number of Cases   Mean AGCT Score   Standard Deviation   Difference      Per Cent of
Schooling    White   Negro     White   Negro     White   Negro        Between Means   Overlap*
: 242 a81 824 50.4 rr 20 10 
2 = 51 91.2 58.4 16.2 19.6 32.8 9 
3 129 o 884 57.8 20.5 18.6 30.6 6 
4 ao 93 376 23.2 184 33.6 8 
5 if 1al 90.6 59.8 23.8 20.0 30.8 2 
52 139 90.4 5 = sg 
6 215 191 ; 54.6 25.2 11.6 35.8 6 
7 15 TA 88.0 59. 23 140 28.4 A 
8 212 202 85.4 644 15.2 147 21.0 7 
9 115 110 94.5 69.2 15.4 17.3 25.3 6 
10 146 140 oT tga 15.2 166 27.3 it 
11 wmo tg 1025 79.0 142 170 23.5 1i 
12 179479 | i6 144 16.2 22.0 17 
Bandup 137 90 oe 93.0 15.4 17.8 16.2 11 
Total 2 GS 148 167 22.0 P 
174 2,010 : a 
95.1 68.5 21.2 20.7 26.6 po 
* The percentage of Negroes whose scores exceed the median White score.


Fig. 1. Years of schooling and mean AGCT score. [Figure: mean AGCT score plotted against school grade completed, with separate curves for White and Negro.]

differences which are usually attributed to
educational background. The rosters which
provided the data contained no indication of
the soldier’s place of birth or home address,
consequently no information concerning
possible differences due to regional origin or
quality of schooling can be derived.


Summary

The performance of Negroes on the AGCT
has been compared with that of Whites. The
mean of the White scores was found to exceed
that of the Negro scores. The difference is
significant. Significant differences favoring
Whites were discovered at each last school
grade, but these differences were less at the
higher school grades beginning with grade ten.
Performances of both groups on the AGCT
were found to be directly related to amount of
schooling accomplished by the time of induction
into the military service. This relationship
began after school grade five for
Negroes and after grade seven for Whites.
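
The two statistics used in this comparison, the critical ratio of a difference between means and the per cent of overlap, can be sketched as follows. The article does not state its formulas, so this sketch assumes the conventional critical ratio (difference between means divided by the standard error of that difference for independent samples) and takes overlap as defined in the table footnote (the percentage of one group scoring above the other group's median). The summary figures passed to `critical_ratio` are the totals given in the text and in the legible Total row of Table 1; the individual scores in the overlap example are hypothetical.

```python
from math import sqrt
from statistics import median


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two independent means over its standard error."""
    se_diff = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se_diff


def percent_overlap(scores_b, scores_a):
    """Per cent of group B scores that exceed the median of group A."""
    cutoff = median(scores_a)
    above = sum(1 for s in scores_b if s > cutoff)
    return 100.0 * above / len(scores_b)


# Totals as reported: means 95.1 and 68.5, SDs 21.2 and 20.7, N 2,174 and 2,010.
cr_total = critical_ratio(95.1, 21.2, 2174, 68.5, 20.7, 2010)

# Hypothetical individual scores, for the overlap computation only.
group_a_demo = [70, 85, 95, 105, 120]
group_b_demo = [50, 60, 70, 100, 110]
overlap_demo = percent_overlap(group_b_demo, group_a_demo)
print(round(cr_total, 1), overlap_demo)
```

Under these assumptions the critical ratio for the total samples is very large, which is consistent with the statement that all the differences are statistically significant.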


Received March 15, 1951. 


Measuring Educational Leadership Attitudes * 


J. J. Valenti 
State Teachers College, Moorhead, Minnesota 


The problem was concerned with the de- 
velopment and evaluation of an instrument to 
measure the attitudes with which teachers and 
administrators view various problem areas 
pertaining to the social role of the teacher. 

By the time they have entered into active 
teaching roles, teachers and principals have 
formulated, consciously or unconsciously, 
certain “philosophies of education” (values
or attitudes) which they use as a frame of
reference in observing various aspects of their 


teaching jobs. The extent and degree to which
the attitudes or “philosophies” of teachers and
principals within a given school are in agreement
has an important bearing on the nature
of their personal relationship and their relationships
with others in the school situation.
This study, then, is more concerned with informal
aspects of interpersonal relations rather
than the formal aspects. Because of the local
structure of our public schools, it would also
be of interest to consider the sentiments of
other persons in the educational situation:
parents, community, other employees, students.
This study, however, was restricted
to the attitudes of teachers and administrators.

The attitude evaluated in this study is the
teacher’s frame of reference in his teaching
role or the point of view he develops toward
various aspects of his teaching situation. This
point [illegible] various styles [illegible] given school
[illegible]. From the results of such an instrument, it
was hoped that an in-service training program
could be drawn up to modify attitudes toward
desired ones or at least to recognize existing
ones which may create sources of friction.

* This report was presented [illegible] Minnesota
[illegible] a summary of a Ph.D. thesis [illegible]
Evaluation [illegible] Leadership [illegible] Social Role
of the Teacher, [illegible] University of Chicago Library
[illegible] appreciation to Professors [illegible] advisers.


Conceptual Framework

The conceptual framework of the
measuring instrument used in this study and
in a previous study (6) of the International
Harvester Research Project of the University
of Chicago,1 was developed from a survey of
the literature of leadership, communication,
and supervision. In the literature it appeared
that there were two general approaches to the
study of interpersonal relations. One
group of studies seemed to be [illegible] concerned
mainly with the qualities, skills, knowledge, [illegible]
of the individual leaders. [illegible]
represented various studies dealing with (a)
eminent men and heroes; (b) physical characteristics
of leaders; (c) traits of leadership;
(d) job analysis; and (e) supervision in
American Schools. The second group of
studies appeared to be [illegible] informal-group
centered approach. The findings in
this group in many [illegible] influenced,
generally, by the theory and concepts of topological
psychology or field theory, and, particularly,
by the leadership [illegible].
These influences are observed in the literature
of: (a) sociometry; (b) [illegible];
(c) human relations; (d) [illegible]
administration; and (e) [illegible].
While both of the above approaches have [illegible]
valuable contributions to [illegible]
personal relations, they [illegible]
force of institutional expectations [illegible]
which condition the roles of [illegible]

1 Nelson’s study, one of [illegible]
Harvester project, was [illegible]
rationale of the leadership [illegible]
leadership styles.

2 The study of leadership has [illegible]
central to the more general [illegible]
the relationship between the individual and
society [illegible] more particularly, between
[illegible] and group. In both areas there is [illegible]
profound shift in methodology and attention [illegible]
individual to social interaction.

These Chicago research studies have viewed
the leadership process as mainly one of maintaining
effective communication, and have tried
to determine the internalized attitudes of various
levels in the organizational hierarchy
toward the social relationships at the “hard
core” of the institution.3 The “hard core” is
that point where institutional functions are
translated into action, where organizations
focus their attention. Since the role of the
teacher represents the central position in the
educational organization concerning which the
organizational agents (superintendents, school
board members, principals, teachers, pupils,
parents) may have varying concepts, it is well
suited for such an analysis.

3 The term “hard core” has been described [illegible]
lectures of Prof. E. C. Hughes, Department of
Sociology, University of Chicago.

This study has attempted to measure the
[illegible] his role and [illegible]
styles have [illegible]: A. Impersonal Style;
B. Personal Style; C. Counseling or Developmental
Style; D. Integrating and Coordinating
Style. [illegible] toward [illegible] the school
[illegible] personnel, [illegible] the community
[illegible] Superintendent) in certain areas of contact
with these people. They reveal, then, the
teacher’s “philosophy of education,” or the
way he defines his situation.

A. The Impersonal Style represents the
teacher who sees authority and expert opinion
at the top of the hierarchy and [illegible]
himself as the representative of that authority
[illegible] pupils of equal consideration.
He receives a great feeling of security in depending
upon the expert opinion and in following
“the rules and regulations” of his position
rather closely. He is inclined to be loyal, conforming.
The tone of his interaction is formal,
marked by frequent one-way communications
and infrequent two-way communications.

B. The Personal Style represents the teacher 
who is a rugged individualist, technically- 
proficient, a good disciplinarian and hard 
worker. He receives a great deal of satisfac- 
tion from his own creative work, relies mainly 
on his own ability and knowledge. The tone 
of his interaction is less rigid than that of the 
Impersonal Style teacher. He maintains in- 
frequent two-way contacts, but his interaction 
is more personal. 

C. The Counseling or Developmental Style 
adds more to the qualities of the teacher. The 
teacher of this type is interested in social con- 
tact, in developing and guiding his pupils. He 
does this mainly through the use of individual 
incentives—praise, reward, friendliness. He is 
very much concerned about the background of 
each of his pupils so he may “guide” them. 
For this reason he is likely to use tests and 
measurements to a great extent. The tone of 
his interaction is much less formal than the 
Impersonal and Personal Styles although his 
methods of counseling are mostly “directive.”
He shows somewhat of a two-way interaction. 

D. Integrating and Coordinating Style. This 
style of behavior represents the other extreme
of the continuum—the informal or group ap- 
proach. A teacher with this style tries to 
develop group standards, helps the group to 
express its own opinions. He conceives of his 
participation with the group as being that of 
a “catalytic” agent rather than that of an 
authority. The tone of his interaction is very 
informal with frequent, unstructured (and
non-directive) two-way communications.4


Methodology 


Since the attitudes of teachers and principals
were to be studied in terms of the framework
of the four leadership styles, it was necessary
to select methods of evaluation appropriate
to the needs of the investigation. Although
possible methods of appraisal included pencil
and paper tests, observation techniques, interviews,
questionnaires, collection of actual (created)
products of those being tested, and
records (especially anecdotal and behavior
records), it was decided that the needs of this
investigation could best be served by the use
of a self-administered inventory, a form of
attitude questionnaire.

4 These four styles are identical to the four
described by Dr. Charles W. Nelson, Industrial Relations
Center, University of Chicago, in a paper given
at the 1950 American Psychological Association meetings,
Pennsylvania State College. The terminology
used in Nelson’s paper is A. The Bureaucratic-Regulative
Concept; B. The Autocratic-Competitive Concept;
C. The Idiocratic-Manipulative Concept; D. The
Democratic-Integrative Concept.


The following hypotheses were set up:

1. The teacher’s or administrator’s [illegible]
as measured by scores based on [illegible]
to the instrument indicate that [illegible]

[illegible], and group the
items according to the style [illegible] behavior the
scale was designed to reflect.

[illegible] so constructed as to
reveal individual differences in attitudes which
can be measured and which overcome any
stereotypes which may exist concerning the
role of the teacher.

[illegible] Evaluations of the [illegible] (respondents)
scores on the scales.

5. (Null hypothesis.) There will be no significant
relationship [illegible] respondent’s
[illegible] factors as
[illegible] years of
[illegible], grade or level taught, subject
taught, school system taught in, teaching load,
average intelligence [illegible] taught, size of
class taught.

[illegible] social relationships [illegible]
of information [illegible] student teaching and general
[illegible] on public, private,
[illegible] and educational [illegible]
interviews with [illegible] graduate students
[illegible] questionnaire distributed [illegible]
administrators and teachers.
As a result of [illegible],
the following relationships had [illegible]
as representative ones for studying the
role of the teacher in this investigation.
I. Dealing with Pupils.
[items 1 to 3 illegible]
4. Rating, testing, and recording behavior of pupils.
5. Handling routine classroom matters.
6. Qualities of a good pupil; teacher's demands.
7. Handling [illegible] (grievances and complaints).
8. Dealing with pupils [illegible].
9. Dealing with pupil cliques [illegible] classroom groups.
10. Dealing with [illegible] student groups [illegible] student organizations.
11. Motivating pupils; pupil incentives.
12. Determining pupils’ attitudes and stimulating morale.

II. Dealing with the Principal and Other Teachers.
13. Qualities of a good teacher; teacher selection; teacher expectations.
14. Orientation of new teachers [illegible].
15. Rating of teachers.
16. Improvement of instructors; in-service training.
17. Changing methods of instruction; adjustment to change.

III. Dealing with Parents.
18. Handling parents’ [illegible] and complaints.

IV. Dealing with the Community.
19. Relationships with community groups.

V. Dealing with the Superintendent.
20. Rules, duties, and policies; the superintendent's demands.
21. Getting action on school [illegible]; teacher’s suggestions.
22. Incentives to better teaching.

VI. Dealing with Other Employed Personnel.
23. Colleague relationships; dealing with the custodian [illegible].

structec t cO 
Items were then constru . 


re of t'in 
j: indicative “re 
three areas which were KA wer 
Styles of leadership. The (which ‘nro 
Porated into the ag aba pee 
tively named “Opinion ing method pat Ot 
modified use of Thurstone’s hopes . dg 
comparisons (3, 8). It a manye 
guising” the scale eng Ponts Sent we 
Would avoid absolute judg great ra 
ents and minimize to a tualized in 4 
tendency toward “intellec ect score”? ye 
Two methods of scoring (direct and indirect scores) were established.
Two sets of judges (one set having knowledge of the framework and
the other set having no knowledge of it) were asked to classify the
items. In the first test, twenty selected judges were given


Measuring Educational Leadership Attitudes 


explanations of the framework of the continuum and of the four
leadership styles. They were also each given ninety-two cards
(twenty-three areas, four cards for each area) with the four
alternatives in each area placed in random order.

The twenty judges, including five University of Chicago faculty
members, five research assistants, five graduate students, and five
educators employed in the field, were asked to read the definitions
carefully, and then to classify each item as A, B, C, or D in terms of
the framework.

In the second test, four judges (four University of Chicago
professors) were given a similar set of cards, each card containing
one of the four alternatives. These judges were not given definitions
of the four styles. In addition to being asked to describe the four
styles, they were also requested to arrange the four patterns on a
four-point continuum and to state briefly what they thought the
framework of the continuum was. The object of both tests was to
correlate the testmaker's classifications of the items with the
ratings of the two sets of judges, and also to scale the items
somewhat in the manner of Thurstone's method of equal-appearing
intervals.

Results 


A preliminary form of the “Opinion Inven- 
tory” was administered to 73 subjects and a 
revised form to 515 teachers and administra- 
tors of 41 schools (14 school systems in Illinois, 
Indiana, Michigan, and New York). An 
attempt was made to select schools whose 
personnel were representative of those in 
typical city school situations. Table 1 in- 
dicates some of the characteristics of the sub- 
jects selected for this study. The two adminis- 
trations of the inventory revealed that: 


1. The instrument could, to a great extent, reveal individual
differences in attitude scores. Fairly large ranges and standard
deviations were obtained for the A, B, C, and D style distributions,
as is illustrated in Table 2 (hypothesis 3).


Table 1

Comparison of Some Personal and Social Characteristics of the 515
Respondents with Those of Educators in Typical City School Situations

Characteristic                  Average              Range                    Nat'l. Average

Length of Experience            14                   0-30                     [none given]
Academic Training
  (a) Pre-Service               Bachelor's           No Degree to             Bachelor's
                                Degree (57%)         Doctor's                 (44.7%)*
  (b) In-Service: No. of
      Graduate Courses in
      Last 3 Years              Less than 1          [illegible]              [illegible]
Age                             33                   21-65                    [none given]
Sex (% Male)                    25%                  [none given]             [illegible]
Teaching Load
  (a) Classes per Day           [illegible]          1-8                      [illegible]
  (b) Average Class Size        [illegible]          15-45+                   [illegible]
Brightness of Pupils            Average              Below Average to         Average
                                                     [illegible] Average
Grade Level
  Elementary                    53%                  K-6                      56.7%**
  Secondary                     47%                  7-12                     43.3%
Position Held
  Superintendent                3%                   Teacher to Supt.         1%**
  Principal                     7.5                                           4
  Teacher                       89.[illegible]                                95
Population of Locality          (2,500-9,999)†       1,200-8,000,000          (2,500-9,999)**

* 48 State School Systems, Council of State Governments, 1949.
** Biennial Survey, Statistics of City School Systems, 1945-46.
† Represents most frequent category.


J.J. Valenti 


Table 2

Means, Standard Deviations, and Ranges of Teacher and Administrator
Leadership Attitude Scores

            Impersonal    Personal      Counseling     Integrating
            “A” Style     “B” Style     “C” Style      “D” Style

M             24.5          30.00         38.00          45.5
S.D.           7.0           6.5        [illegible]       9.0
Range         8-49          11-52       [illegible]    [illegible]
N              515           515           515            515
wW items wae oe (D 
2. The large standard deviations obtained and the testmaker’s items was +. 


and curve-fitting procedures showed that the 
distributions of the four leadership scores fit 
the normal curve rather tl 
U shaped curves w 


Chi square 
ces between 
curve) fre- 
istributions, 
he two dis- 


Scale values 
hophysical method of 
als followed the pre- 
(hypothesis 1) 
_ 4. The correlation of the test key with the 

of the four professors without defi- 


four Styles obtained a product- 
moment coefficient of +.90. The descriptions 


pothesis 2). sotto eet TOT 
5. An item analysis of the preliminary for 
of the inventory showed that the ie 5 

the item difficulties clustered around | lerable 
per cent level although there was r i 
range among them. A Kuder-Riché ie 

formula® for determining reliability =e An 
in conjunction with the item anean was 
“abac” chart developed by mangn data 
used to determine coefficients from S cents 
of the upper and lower twenty seven Pot ED 
of the distributions. Coefficients of = ie , 
57, and .88 were obtained for the er (88; 
C, and D scores, while coefficients of k A B, 
and .61 were obtained for the indore od 
and C scores. The indirect score met a ho 
this case is the term used for scoring ™ ons 

which counts both direct and indirect resp R) 
of subjects. Reliability coefficients rev 
for direct A, B, C, and D scores on the 


in 


ised 


€ The K-R formula used was: 20 ETTA 
n Sta, ae j 
--— Ep ripy FM- = 
tu = ae E Ne ee a 
The assutnption made in using this formula 


a 
al * 
ee 


the * 
rationale. The Single factor is being measured. The mat J. me, 
$ ` f as is explained in G, F. Kuder and M. W- jio 
a ents for the relationship between twenty ca tee Opes st of test relia” 
Judges with the knowledge of the rationale Psychometrika, 1937, 2, 151-160. 
Table 3 ted 
Per 
Chi Squares, p cin nae and Probability Levels in Comparing Observed F requencies and a 
urve) Frequencies for the Four Leadership Distributions _ a 
Impersonal g tograting 
sees ET Counting Tauera 
xe S cores 0 
8.79 14.3 
11. 
df. 11.00 an 11.62 13.00 
P ‘70 a 9.00 .30 A 


-30 


[Fig. 1. Vertical axis: height of ordinate (frequency).]

form were respectively .79, .74, .65, and .87.
Standard errors of measurement were computed for the four styles.
For the first administration they were ±3.22, ±3.33, ±3.34, ±3.21.
For the revised test they were ±3.21, ±3.32, ±3.25, ±3.24.

6. As a check on the validity of the inventory, evaluations of the
respondents were made by their colleagues and supervisors. A sample
of about one-fifth of the respondents (105 subjects) was selected for
study. The cases in this sample were made up of those individuals
whose leadership scores were one standard deviation above the mean
for either the A, B, C, or D styles. In each case, either a
superintendent, principal, or fellow teacher (sometimes both
principal and superintendent) was given the definitions and
descriptions of the four styles, and asked to observe and evaluate
the respondent in terms of which style of leadership he preferred.
A product-moment correlation of +.59 was obtained between the
subjects' test scores and the ratings by colleagues and supervisors.
This would indicate a significant degree of relationship between the
inventory scores and a criterion: the evaluation of leadership
attitudes by colleagues and supervisors. However, it should be
pointed out that this method of validation assumes that evaluations
can be made reliably and in terms of a single index of
classification. Other methods of validation for further study are
recommended below.


Table 4

Correlations between A, B, C, and D Scores of
Teachers and Administrators

          B        C        D

A        .29     -.56     -.84
B                 .48     -.64
C                          .14


Fig. 1. Histogram and normal curve comparing distribution of
impersonal “A” style scores. (Horizontal axis: scores.)


7. A search, through analysis of variance, for certain personal and
social correlates of leadership attitudes showed few significant
relationships. Table 5 gives the F ratios obtained. Age and
experience partially affect attitudes: the younger and less
experienced the person, the more integrative are his attitudes; the
older and more experienced, the more formal and impersonal they are.
Amount of
academic training as evidenced by the posses- 


®In a study of foremen in industry, Nel 
product-moment coefficient of hee een ee 
men’s attitude scores and analyses of their Rorschach 
and T A T’s by clinical psychologists. In recent month: 
Nelson has been finding significant relationships b : 
tween leadership attitudes and Szondi test re 




Table 5 


F Ratios among A, B, C, and D Scores for Cer 


ea Chaadar ot 
tain Personal and Social Characterist 


Teachers and Administrators —_ —— 
= 7 s Integraun 
Personal Counseling “p” Scores 
Total Impersonal ers g “OC” Scores a 
eas af. “A” Scores “B” Scores : 2.17 
teristic be ne 
Charac n ET 1.73 1.04 92 
T Oo 2 Bo mo @ 
S “ 5 
Saker 512 4.01 1.62 To 105 
ositi q ne 3: 4.05 
Experience 512 a iat 1.94 1.43 
Pre-Service Training 512 os 5 1.70 1.43 1.41 
In-Service Training 512 ee ie L49 155 
Subject Taught ala au} 6a 1.13 1.31 
No. of Classes 512 acl a. ii in f ; 
Class Size 512 2.10 A 147 6,90" 
Brightness of Pupils 512 1.11 ` t0 129 
Grade Level 450 1.21 1.06 AT “Ss 
School System 512 1.22 1.54 i 


* Significant at the five per cent level, 
** Significant at the one per cent leve], 
z Significantly homogeneous, 


grees, or by the recent 
te courses 


Important 
eterminant Seems to be the individual school 
Situation, Į appears that the uman rela- 
tionships within e 


Implications and Recommendations 


las been of value since the in- 


: ent developed 
: : 

s ; Jectifying and 
measuring Social Interaction itself rather than 
interaction Inferred f t characteristics 
of Individuals involve > Tucture of th 
organization, or the techni Sed. To rake 
Fa p Opinion Inventory» rther value t 

€ field of ¢ ucation certain 
ai “commendations 


ho” 
le of the py 
1. A study should be age a 
logical factors in the four E the tea 
Projective tests administerec Jinical PSY per 
Studied and interviews by dafinë he 
ogists can make further ue types of 
havior expected from the eE y a in 
study of personality wouk adjustme! 
assistance in bringing about ad) ; col 
social relationships. ive study % 30% 
2. A more comprehensive hould be 
relates in leadership dimin ‘factors a 
The investigation of socia than sups 1 
report has been little a e looked ‘dh? 
More subtle factors should i and h 
status, “parental geen tk a ii 
experiences, interests, m aln 
be evident that more in to obtain 
investigation are needed a) 
formation, validity 
3. Further checks for 1 be ma et’ 
reality of the four styles eect jnt peb 
range, intensive, non-dir ilor 
and by continuous observ 
by participant observers. se int 
4. The instrument can 


oring: 
(a) A better method of ingle ose 
that has been made of a ne easi 
each leadership style R justi 
on statistical grounds. 


on 


roved , 




a single score, positive intercorrelations among 
all the items are needed. This would involve 
the computation of some 10,000 correlations. 
Since this is a rather tedious task, the use of 
short cuts such as Thurstone’s Edgemarking 
method should be sought. 

(b) If the above intercorrelations are calculated, they can be of
use to further research in the form of a factor analysis.

(c) Scaling of the items is suggested. Some device to determine
psychological distance between choices may be used, such as
Thurstone's law of comparative judgment (Case V) or Guttman's
scaling method.

(d) A reduction in the length of the inventory is in order, in line
with the suggestions of many of the respondents. This may be achieved
by making alternative forms of the 138 items through two scales of 69
items each, and by the use of some other psychophysical method.
Although paired comparison methods are very subtle and make for
substantial internal consistency, they are reported to be very boring
for the subjects responding. Another suggestion for reducing the
length of the inventory is to eliminate the two least discriminating
pairs of responses from each of the 23 areas. This would result in a
scale of 92 items.


5. Research is needed to determine which styles are needed for each
leadership situation. The casual reader might infer from the
rationale that the integrating “D” style individual is the preferable
one. This is not the intention. If the leadership needs stem from the
demands of the situation, it would probably be very difficult for a
group long accustomed to formal and structured relationships to
adjust itself to an integrating style leader. We must determine how
effective the four leadership styles are in various types of
situations.

6. Once the style of leadership needed has been determined, there
remains the matter of changing and modifying individual and group
attitudes. Research in modifying leadership mores, in helping people
to redefine their roles, is necessary.


7. A form of the “Opinion Inventory” 
should be administered to the several levels 
of the hierarchy in the educational organiza- 
tion. By observing the responses of teachers, 
pupils, parents, the principal, the superintend- 
ent, community groups, and other employed 
personnel in a local situation, we may better 
understand the nature of the network of in- 
formal organization. 

In the further administration of the “Opinion Inventory,” its use as
an index for promotion, hiring, firing, rating, and salary scaling is
definitely not recommended. Such actions would probably destroy the
good rapport needed with subjects and render the instrument useless.

8. The framework used in this study may be used in other
occupations. It has already been used for foremen in industry, and
for the dealer-customer relationship. It can probably be used in
studying the communication process at the hard core of any social
institution. It is believed that the eight recommendations above can
help the instrument make a greater contribution to educational
research.
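
The item counts quoted in recommendation 4 can be reproduced from the design of the inventory. The sketch below assumes that the four alternatives in each of the 23 areas are paired in every possible way (the text does not say so explicitly, but this assumption yields the stated 138 items). It also counts the inter-item correlations that the text rounds to "some 10,000". All variable names are illustrative.

```python
from math import comb

AREAS = 23                  # areas covered by the inventory
ALTERNATIVES_PER_AREA = 4   # one alternative per leadership style (A, B, C, D)

# Assumption: every pair of alternatives within an area forms one item.
pairs_per_area = comb(ALTERNATIVES_PER_AREA, 2)

total_items = AREAS * pairs_per_area

# One correlation for every distinct pair of items.
inter_item_correlations = comb(total_items, 2)

# Dropping the two least discriminating pairs from each area.
shortened_items = AREAS * (pairs_per_area - 2)

print(total_items, inter_item_correlations, shortened_items)
```

On this reading the full inventory has 138 items, the shortened form 92, and the full intercorrelation matrix a little under 10,000 distinct coefficients.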


Received May 3, 1951. 


References 


1. Allport, G. W. Attitudes; A handbook of social psychology. Ed.,
Carl Murchison, Worcester, Massachusetts: Clark University Press,
1935.

2. Flanagan, J. C. General considerations in the selection of test
items and a short method of estimating the product-moment coefficient
from the data at the tails of the distribution. J. educ. Psychol.,
1939, 30, 674-680.

3. Guilford, J. P. Psychometric methods. New York: McGraw-Hill Book
Company, 1936.

4. Kuder, G. F., and Richardson, M. W. The theory of the estimation
of test reliability. Psychometrika, 1937, 2, 151-160.

5. McNemar, Q. Opinion-attitude methodology. Psychol. Bull., 1946,
43, 289-374.

6. Nelson, C. W. Development and evaluation of a leadership attitude
scale for foremen. Unpublished Ph.D. dissertation, Department of
Sociology, University of Chicago, 1949.

7. Thurstone, L. L. Elements of test theory. New York: John Wiley and
Sons, Inc.

8. Thurstone, L. L. Three psychophysical laws. Psychol. Rev., 1927,
34, 424-432.


Verbalization and Learning a Manipulative Task 


C. H. Lawshe and William Cary 


Occupational Research Center, Purdue University 


The growing recognition of the importance of industrial employee
training has been accompanied by a search for those principles of
training which will maximize the effectiveness of job instruction.
One such principle which has received considerable attention in
recent years is that of encouraging the learner to repeat to the
instructor those things which he has learned. The implication of this
process is that it will show those gaps in the process where the
learner did not absorb the presented ideas, and that it will also fix
the process in his mind.

This study was designed to determine the effect of verbalization on
the number of trials required to learn a manipulative task. In
addition, the effect of verbalization on a learner's ability to
conform to a prescribed procedure was investigated.


Procedure 


Method. The method utilize 


A d tasks consist- 
Ing of sub-tests A— 


2 and A-4 from the Purdue 
These tests! 


timed trials were 
the Standardized 
corded as orrors, 


given. Al 


in each sub-test w i 
of the taskẸrevealed that 
minimize the actual time} 


m 
r required for 
altering the nature of the fk, a 


only, individuals were matched and placed into either a control group
or an experimental group. Next, members of each group were taught to
assemble sub-test A-4. The procedure followed with the control group
was identical to that just described. The experimental group
procedure differed only in that members were required to verbalize
their instructions to the trainer, then verbalize as they assembled.
Each subject was given as many trials as necessary to attain a
pre-established time standard; all trials were timed and, since
certain errors were not reflected in time scores, they were also
recorded.


Results 


ata. a 
Following the collection of shese Se 
Statistical analysis was made of the di y ] an 
of performance between the expec er” 
control groups. A comparison of the tion 
ences between means and standard want 
of the control and experimental groups 0 


: yas 
i karioi i 
number of trials to reach the te a 
investigated, Such a comparison een t 
likely to reveal any differences betw 


groups which would tend to be ogee 
the trial criterion, The procedure an, y 
was the orthodox method used in Meme 
ences between two small matched promn ma 
In summarizing these statistical data, it was found that the “t”
ratios of the mean differences on the number of trials required to
meet the criterion and the average time on the first three trials
were .34 and .61, respectively. These small values of “t” indicate
that the obtained differences of performance between the two groups
could readily have occurred by chance alone. Similarly, critical
ratios of the standard deviations are not large enough to permit
refutation of the hypothesis that there is no true difference between
the groups in variability of performance. It will be observed that
the critical ratio for the number of trials to meet the criterion is
.81 and the critical ratio for the average time on the first
three trials is 1.38.

Although the latter critical ratio is non- 
significant according to accepted standards, 
it does suggest that verbalization may exert 
a differential effect on the “fast” and “slow” 
performers. In order to investigate this
possibility further, each group of subjects was
divided into two sub-groups: (1) those whose 
time scores on the Matching Phase sub-test 
(sub-test A-2) fell above the mean; and (2) 
those whose time scores fell below the mean 
on this sub-test. Those groups whose time 
scores fell above the mean, i.e., required more 
time, are hereafter referred to as the “high” 
groups; those whose time scores fell below the 
mean are referred to as the “low” groups. 

After this sub-division, the differences of 
means and standard deviations between the 
“high” groups and the “low” groups for both 
measures of performance on sub-test A—4 were 
investigated. All differences were too small 
to warrant rejection of the hypothesis that 
there was no true difference of performance 
between the groups. If the 15 per cent level 
of confidence is established as indicating sig- 
nificance, all differences of means and standard 
deviations are readily attributable to chance 
factors. 

In view of these low probabilities of a true 
difference beyond zero existing between the 
groups it can be reasonably concluded that 
the experimental group required as many 
trials to meet a performance criterion and
spent as much time on their first three trials
as the control group. 
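
The article reports “t” ratios for matched groups without showing the computation. As an illustration only, the sketch below assumes the conventional difference-score form for matched pairs, t = mean(d) / (s_d / sqrt(n)); the exact formula the authors used is not legible in the text. The function name is hypothetical, and the scores in the usage lines are made-up values, not data from the study.

```python
import math
import statistics


def matched_pairs_t(control, experimental):
    """t ratio for matched pairs, computed from within-pair differences.

    Assumed form (not confirmed by the article): mean difference divided
    by its standard error, s_d / sqrt(n).
    """
    if len(control) != len(experimental) or len(control) < 2:
        raise ValueError("need two equally long lists with at least two pairs")
    diffs = [e - c for c, e in zip(control, experimental)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean_diff / (sd_diff / math.sqrt(len(diffs)))


# Hypothetical trials-to-criterion scores for six matched pairs.
control_trials = [5, 6, 7, 5, 8, 6]
experimental_trials = [6, 6, 8, 4, 9, 7]
print(round(matched_pairs_t(control_trials, experimental_trials), 2))
```

A small ratio of this kind, such as the .34 and .61 reported, indicates a mean difference that is small relative to its standard error.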

It will be recalled that certain parts in sub-test A-4 could be
positioned in a different sequence from that prescribed by the
instructor without being reflected in the time score. The differences
of means and standard deviations between the experimental and control
groups on the number of such errors committed were tested for
statistical significance. Only those errors committed on the first
three trials of sub-test A-4 were included in the analysis, since all
subjects performed the task a minimum of three times, but a varying
number of trials were required to reach the criterion. As the number
of errors was a function of the number of trials on the task, such a
comparison gave each subject an equal opportunity to commit an error.

It was observed that the “t” ratio of the 
mean differences between the experimental 
and the control groups is 1.20 and the “t” ratios 
of the “high” groups and the “low” groups 
are 1.20 and .56, respectively. These differ- 
ences are not significant at the 20 per cent 
level of confidence. Hence, it is concluded 
that the experimental group did not make 
significantly fewer errors than the control 
group. 

Summary and Conclusions 


The purpose of this experiment was to determine the effect of
verbalization on the number of trials required to learn a
manipulative task.
Fifty-two students at Purdue University were 
divided into two matched groups and individ- 
ually taught to assemble a manipulative task. 
After this instruction, the experimental group 
verbally described each operation of the task 
back to the instructor as they performed it 
and the control group performed the task 
without verbalizing. The groups then re- 
petitively performed the task until they 
reached a pre-established criterion of learning. 
Differences of means and standard deviations 
between the control and experimental groups 
on the number of trials to reach the criterion 
and the number of errors and average time on 
the first three trials were analyzed for their 
significance. None of these differences were 
found to be significant. Each group was then divided into two
sub-groups and the differences between the means and standard
deviations of performance were analyzed. Again it was found that all
differences were so small that they could be readily attributed to
chance variations.

On the basis of these results, the following conclusions were drawn:


1. The experimental group which verbalized 
or “talked back” their instructions to the 
instructor required as many trials to meet 
a performance criterion and spent as much 
time on their first three trials as did the control 
group. 

2. Similarly, the experimental group did 
not make significantly fewer errors on the 
first three trials than the control group. 




3. The high experimental group whose time 
scores fell above the mean on the initial task 
required as many trials to meet a performance 
criterion and spent as much time on their first 
three trials as did the high control group. 


d out that subjects 
hile they per- 


ey verbalized only 


on one trial. Further studies in this area might reveal that verbal
description without performance and/or verbalization for a greater
number of trials exerts a positive effect on the ability of a learner
to perform a task.


Received April 4, 1951. 


References

1. Cary, William, Jr. The effect of verbalization on the number of
trials required to learn a manipulative task. Unpublished master's
thesis, Purdue Univ., 1950.

2. Graney, M. R. The construction and validation of a new type of
mechanical assembly test. Unpublished Ph.D. thesis, Purdue Univ.,
[year illegible].


A Punched Card Procedure for Use with the Method of 
Paired Comparisons 


N. C. Kephart and James E. Oliver 


Division of Education and Applied Psychology, Purdue University 


Problems involving the method of paired 
comparisons require the preparation of in- 
dividual slips with one pair of items on each 
slip. It is desirable that the names be paired 
on separate slips, as opposed to mere presenta- 
tion of two lists of the names so that time and 
space errors may be controlled. (Time and
space errors relate to the order of presentation
of pairs and to the relative position of members
of the pairs, respectively.) Such a slip is
required for each of the possible pairs of items 
in the study, the number of such pairs being 
equal to N(N-1)/2. The labor involved in 
the preparation and scoring of these pairs has 
been an often repeated adverse criticism of 
the method. 

The materials necessary for such problems 
can be prepared and scored on punched card 
equipment. Control of the relative position 
of members of pairs and the order of presenta- 
tion of pairs is provided. In addition, con- 
siderable saving is made with respect to the 
time required to make and score the pairs. 
For example, an experienced operator can 
produce a deck of 300 pairs (25 names in 
variable list) by punched card methods in 
approximately 30 minutes. Subsequent scor- 
ing would require approximately 10 minutes. 
If a typewriter were used to prepare the pairs, 
12 to 16 hours of clerical labor would be re- 
quired in preparation and scoring. 
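
The deck size quoted above follows directly from the N(N-1)/2 formula given in the first paragraph. A minimal sketch, with an illustrative function name:

```python
def number_of_pairs(n_items):
    """Number of distinct pairs among n_items, i.e. N(N-1)/2."""
    return n_items * (n_items - 1) // 2


# 25 names, as in the example in the text.
print(number_of_pairs(25))
```

For the 25 names in the example this gives the 300 pairs cited in the text.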

An example of the procedure for setting up 
paired comparison materials applied to ratings 
of workers on job performance is given below. 
This procedure is systematic and can be di- 
rectly applied to any other problem involving 
the paired comparison technique. 


Preparation of the Pairs 


1. Assign each worker a serial number from 1 to N. The method of
assigning numbers makes no difference to the results. Any order of
the original names can be used but no serial number can be used more
than once. Punch the serial number in columns 1 and 2 and the names
in columns 5 to 25, punching one card for each name. Call this master
deck 1.

2. Reproduce the cards from step 1, chang- 
ing the serial number to columns 3 and 4 and 
the name to columns 40 to 60. Write the 
serial number (punched in each card) on the 
back of each card for the first one-half of the 
cards in the deck. Call this master deck 2. 

3. Reproduce the cards from step 1. (Be sure that the serial numbers
of these cards are in consecutive order.) Call this set 1 and write
a “1” on the back of the last card reproduced, i.e., label the set.

4. Reproduce the cards from step 1 again. 
Call this set 2 and write a “2” on the back of 
the last card in this deck. Lay set 2 adjacent 
(not on top of) set 1. 

5. Continue reproducing and numbering each set of cards from step 1
(set 3, set 4, etc.), each time writing the consecutive number of
each set, until the appropriate number of sets has been reproduced.
If the number of names being paired is odd, (N-1)/2 sets are needed.
If the number of names being paired is even, N/2 sets are needed.

6. The serial numbers on the cards in each set are in consecutive
order. Be sure the cards from step 2 (master deck 2) are in
consecutive order.

7. Reproduce deck 2 into each set that has been prepared.

a. Before deck 2 is reproduced into set 1, remove the number 1 card
from the bottom of deck 2 and place it on top.

b. Before deck 2 is reproduced into set 2, remove the number 2 card
from the bottom of deck 2 and place it on top.

c. Before deck 2 is reproduced into set 3, remove the number 3 card
from the bottom of deck 2 and place it on top.

d. Continue this procedure until all sets have been reproduced.


Fig. 1. Card on which judge's choice is recorded.

8. If an odd number of names is being paired, the appropriate number
of pairs will be made after the last set has been reproduced from
step 7.

9. After pairs are all prepared, reproduce random numbers in columns
76-78 from a previously prepared deck of random numbers.

10. Sort cards progressively from 76-78.

11. Interpret names on cards; do not interpret serial numbers. Leave
considerable spacing between the two names. See Figure 1.

12. If a judge rates on more than one factor, each independently, or
if several judges are rating, more than one deck of paired names is
needed. After step 10 above, reproduce and interpret as many
additional decks as may be required. As new decks are reproduced,
pairs will be in correct order for presentation.

Example: 7 names (cards go in reproducer in this order when deck 2 is
reproduced into each set).

Set 1:    7  6  5  4  3  2  1
Deck 2:   1  7  6  5  4  3  2
Set 2:    7  6  5  4  3  2  1
Deck 2:   2  1  7  6  5  4  3
Set 3:    7  6  5  4  3  2  1
Deck 2:   3  2  1  7  6  5  4

7(6)/2 = 21 pairs

13. When the number of variables is even, one-half of the last set in
step 7 must be destroyed, as the pairs on the first and last halves
are the same except that the order within pairs is reversed. Destroy
one-half of the last set reproduced from step 7 and proceed with
steps 9 to 12.

Example: 6 names

Set 1:    6  5  4  3  2  1
Deck 2:   1  6  5  4  3  2
Set 2:    6  5  4  3  2  1
Deck 2:   2  1  6  5  4  3
Set 3:    6  5  4  3  2  1
Deck 2:   3  2  1  6  5  4    (destroy one-half of this set)

6(5)/2 = 15 pairs
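
The card-handling steps above amount to pairing a fixed deck with successively rotated copies of itself. The following sketch is a software analogue of steps 3 to 10 and 13, not a description of the punched card equipment; the function names are hypothetical. It makes two simplifications that are assumptions, not the authors' procedure: the random-number columns of steps 9 and 10 are replaced by random sort keys, and for an even number of names the first half of the last set is the half retained (the text does not say which half to destroy).

```python
import random


def build_pairs(n_names):
    """Return (left, right) serial-number pairs by the rotation scheme of step 7."""
    set_order = list(range(n_names, 0, -1))   # N, N-1, ..., 1, as in the examples
    deck2 = list(set_order)
    n_sets = n_names // 2                     # (N-1)/2 if N is odd, N/2 if even
    pairs = []
    for set_number in range(1, n_sets + 1):
        # Step 7: move the bottom card of deck 2 to the top before reproducing.
        deck2 = [deck2[-1]] + deck2[:-1]
        set_pairs = list(zip(set_order, deck2))
        if n_names % 2 == 0 and set_number == n_sets:
            # Step 13: the two halves of the last set duplicate each other.
            set_pairs = set_pairs[: n_names // 2]
        pairs.extend(set_pairs)
    return pairs


def presentation_order(pairs, seed=None):
    """Steps 9-10 analogue: attach a random key to each pair and sort on it."""
    rng = random.Random(seed)
    keyed = [(rng.random(), pair) for pair in pairs]
    keyed.sort(key=lambda item: item[0])
    return [pair for _, pair in keyed]


deck = presentation_order(build_pairs(7), seed=1)
print(len(deck))
```

Because each set pairs every name once on the left and once on the right, the rotation keeps the left and right positions balanced across names when N is odd, which is the space-error control the procedure is designed to provide.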


Scoring of the Pairs 


The judge then makes his choice between the two members of each pair
and indicates this choice by a pencil mark.

14. Sort the cards by hand into two groups: those in which the
left-hand member of the pair is checked and those in which the
right-hand member is checked.

15. Sort progressively on columns 1 and 2 of the left-hand preferred
group and with the tabulator (controlling on serial number) count the
number of choices for each serial number.

16. Repeat step 15 for the right-hand preferred group, sorting and
controlling on columns 3 and 4.

17. By manually adding the card counts obtained for each worker (each
serial number) on the right and left (steps 15 and 16) the total
number of choices is obtained.
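
Steps 14 to 17 can be sketched in the same way. The representation of a judged card as a (left serial, right serial, side checked) tuple is an assumption made for illustration, as is the function name; the tabulator counts of steps 15 and 16 become two tallies that are then added.

```python
from collections import Counter


def count_choices(judged_cards):
    """Total choices per serial number.

    judged_cards: iterable of (left_serial, right_serial, side), where side
    is "L" if the left-hand name was checked and "R" if the right-hand one.
    """
    cards = list(judged_cards)
    for _, _, side in cards:
        if side not in ("L", "R"):
            raise ValueError(f"side must be 'L' or 'R', got {side!r}")
    # Step 15: left-hand preferred group, counted by left serial number.
    left_counts = Counter(left for left, _, side in cards if side == "L")
    # Step 16: right-hand preferred group, counted by right serial number.
    right_counts = Counter(right for _, right, side in cards if side == "R")
    # Step 17: add the two counts for each serial number.
    return left_counts + right_counts


# Hypothetical judgments on three cards.
totals = count_choices([(3, 1, "L"), (2, 3, "R"), (1, 2, "R")])
print(totals[3], totals[2], totals[1])
```

A serial number that was never chosen does not appear in the result, and looking it up returns zero.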

Received April 4, 1951. 


Errors of Interpolation in Instrument Reading and Setting 


Charles Martin Levett, Jr. 


Lehigh University 


The importance of linear interpolation is seen in situations which
require accurate readings on instruments or dials. If principles can
be established which are applicable to all human beings, it is quite
possible that further research will bring forth methods of overcoming
errors in dial or instrument reading to a greater degree than is
possible at present.


rey 

linear Phy work connected with the study © 
Millen averpolation has been described by 
k lns 1s therefore omitted here. 

interpo D of linear interpolation involved 
(Imm 0? in tenths, of five interval sizes 
ssults sh mm., 3 mm., 5 mm., and 10 mm.). 
be owed large individual differences to 


~ Of more ; : ; 
biases re importance than interval sizes OF 


The purpose of this experiment was to study the accuracy of
interpolation in three different situations: (1) reading from a slide
rule set by the experimenter, (2) setting of the slide rule by the
subject, and (3) the 10 mm. Miller card. It is also the purpose of
this investigation to find out if the Miller cards can be used in
lieu of the slide rule arrangement.


Procedure 


Thirty subjects were 
entas t9 68. Most of. 
er of Suh: Lehigh Universit 
as 1bjects from various one 
cp Well. The students were rep 
Serj all three curricula: liberal arts, 
tagy inant wd business administration. 


at numb z chology under- 
e er were psycho h of the 
me 


used ranging 
the subjects 
yi however, 
occupations 


n. 
Sed, were 
tag ach d 
luge! of 544 on th ix ti 
e ¢ times, ! A 

a lne readings on met oe The various 
heii are arranged in random order. 
lation” K., Jr., An exploratory uty 0. 


` appl. Psychol., 1950, 94 y 


A slide rule without scale divisions or numbers was set up in the
following manner. Two perpendicular lines 10 mm. apart were drawn
on the stationary part of the slide rule while a third perpendicular
line was drawn on the movable section. A mirror was attached to
the movable section. An Army Air Force gun was used as a light
source. The light from this source was reflected by the mirror
on to a scale 1825 mm. away. The scale used was 690 mm. long, and
each interval on the scale was between 67 and 73 mm. long.
Consequently a movement of one mm. on the slide rule resulted in an
approximate movement of 70 mm. on the scale.
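The magnification described above can be checked with a short calculation. The following is a minimal sketch that uses only the figures quoted in the text (a 690 mm. scale representing the 10 mm. interval on the rule); it assumes evenly spaced intervals, whereas the text reports that the actual intervals varied between 67 and 73 mm. The function names are illustrative only.

```python
# Geometry of the magnified scale, using only figures quoted in the text.
SCALE_LENGTH_MM = 690.0    # length of the magnified scale
RULE_INTERVAL_MM = 10.0    # distance between the two lines on the slide rule
TENTHS = 10                # the interval is read in tenths


def magnification(scale_length_mm=SCALE_LENGTH_MM,
                  rule_interval_mm=RULE_INTERVAL_MM):
    """Scale movement (mm) produced by 1 mm of slide movement."""
    return scale_length_mm / rule_interval_mm


def scale_position_mm(tenth, scale_length_mm=SCALE_LENGTH_MM):
    """Nominal scale position of a setting given in tenths (0 to 10),
    assuming evenly spaced intervals."""
    return tenth * scale_length_mm / TENTHS


mag = magnification()                                     # mm of scale per mm of slide
half_tenth_on_rule_mm = 0.5 * RULE_INTERVAL_MM / TENTHS   # half a tenth on the rule
half_tenth_on_scale_mm = half_tenth_on_rule_mm * mag      # same distance on the scale

print(mag, scale_position_mm(5), half_tenth_on_scale_mm)
```

Under these assumptions one mm. on the rule corresponds to 69 mm. on the scale, in line with the "approximate movement of 70 mm." stated in the text.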
The scale was so calibrated and adjusted that zero on the stationary
slide rule corresponded to zero on the magnified scale, and ten on
the stationary slide rule corresponded to ten on the magnified scale.

The movable part of the slide rule was manipulated by means of a
wheel attached to a long threaded bolt which fitted into a nut
attached to the slide rule.

A partition was placed between the subject and the scale to prevent
the subject from observing his results as the experiment progressed.
However, it did not block the experimenter's view.

An overhead fluorescent fixture and a fluorescent desk lamp were
used for illumination. Care was taken to make sure that no shadow
fell on the slide rule. The slide rule was wiped at frequent
intervals to prevent subjects from using specks as reference points.

The Experiment. The experiment consisted of three separate
activities. The subject was first required to interpolate the 54
settings on one 10 mm. Miller card. Each subject was allowed to
interpolate without the use of any device which would prevent
looking back at interpolations already made. Subjects could go back
and change judgments if they desired. The subject recorded his
judgments on a mimeographed sheet with the blanks numbered to
correspond. No time limit was imposed.
In the second part of the experiment the experimenter set the
movable rule at various points between the two perpendicular lines
on the stationary part of the rule, using extreme care in placing
the reference line at exact tenths. After each judgment the
experimenter moved the rule back and forth several times before
putting it in position for the next judgment. Each tenth was
presented six times in random order according to a prearranged key,
making a total of 54 judgments.

The third part of the experiment required 
the subject to set the rule at specified tenths 
between the two lines. After each judgment 
the experimenter moved the rule back and 
forth several times before the next number was 
presented. 

A trial consisted of one Miller card (54 read- 
ings), 54 readings from the slide rule, and 54 
settings by the subject. After an hour of 
preliminary practice each subject completed 
six trials divided into two sessions. This re- 
sulted in a total of 324 readings on each of the 
three parts. 
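The counts in the procedure above (nine tenths, six presentations each, six trials) can be reproduced with a short sketch of how a prearranged random key might be drawn up. This is an illustration only: the text does not say how the original key was produced, and the seeds and function name here are hypothetical.

```python
import random

TENTHS = range(1, 10)       # positions 1 through 9 between the two lines
REPEATS_PER_TENTH = 6       # each tenth is presented six times
TRIALS = 6                  # six trials per subject


def make_key(seed):
    """Return one prearranged presentation order of 54 settings."""
    rng = random.Random(seed)   # fixed seed so the key can be reproduced
    key = [t for t in TENTHS for _ in range(REPEATS_PER_TENTH)]
    rng.shuffle(key)
    return key


# Hypothetical seeds 0..5, one key per trial.
keys = [make_key(seed) for seed in range(TRIALS)]
judgments_per_part = sum(len(k) for k in keys)

print(len(keys[0]), judgments_per_part)
```

Each key holds 54 settings, and six such keys give the 324 judgments per part reported in the text.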


In none of these procedures was the subject 
informed of his errors. 


Charles Martin Levelt, Jr. 


Results 


In Table 1 are shown errors subject-by- 
subject. The column headed “Slide Rule 
Errors” indicates the number of errors in
reading the slide rule set by the experimenter. 
It can be seen that there are large individual 
differences, since errors range from 0 to 87 
out of 324 judgments. Occupation, age, and 
sex apparently are not determining factors. 

The column headed “Mean Discrepancy 
Setting” in Table 1 indicates the average 
amount by which subjects missed the exact 
positions when setting the slide to assigned 
numbers. The discrepancies ranged from a
low of .085 to a high of .421. 


Table 1 


Errors of Subjects 


Subject  Occupation  Age  Sex  Slide Rule Errors  Mean Discrepancy in Setting  Errors Over .5  Miller Card Errors
G.D. Psych. Grad. 22 F 0 085 2 : 
E. D. Ind. Eng. Undergrad. 25 M 0 141 8 0 
W.J. Met. Eng. Undergrad. 22 M 0 .168 6 4 
Beige Psych. Undergrad. 22 M 1 131 0 4 
E. H. Mech. Eng. Undergrad. 20 M 1 -228 12 7 
J.D. Secretary 18 F 3 162 7 4 
B.G. Psych. Undergrad. 23 M 3 -206 16 ; 3 
H.L. Chem. Eng. Undergrad. 18 M 4 -204 13 n 
M.F. Psych. Undergrad. 19 M 6 254 34 T 
P.F. “Elect. Eng. Undergrad. 26 M 7 .241 14 : 
A.S. Pre-Dental 19 2 a 
3 M 7 246 28 3 
«G. Schoo! Teacher 28 F 7 238 19 i 
L.L. School Teacher 3 ? 
fest ee 64 T 8 .263 33 2 
oP. lect. Eng. Undergrad. 18 1 
W.H. Psych. Undergrad. | a E a 18 
iL. ge eee i . 0 M 9 .237 10 4 
e. ra Undergrad. 26 M 9 284 35 3 
istory Undergrad. 19 M 21 
D. B. Bus. Ad. Undergrad. 21 7 it n = 3 
JR. Psych. Undergrad. 25 M 16 273 29 a 
jS Psych. Grad. = M 17 145 11 9 
R- K Salesman 30 v 20 -284 43 73 
RW. Psych. Undergrad, 20 $ 21 -242 26 21 
W.R. History Undergrad, mt 2i 251 x 4 
B. B. Research Assistant 23 M 23 247 2 19 
T.L. Secretary i F 26 .281 29 71 
J.T. Store Clerk x F 30 229 17 42 
S. M. Psych. Undergrad. 26 N 45 312 a 14 
R. if Psych. Undergrad. 19 M 54 -255 20 59 
a . Psych. Undergrad. 19 M 56 -346 83 34 
L. Stock Broker a m 67 21 104 11 
—e 1 87 


oe 279 3o 


Errors of Interpolation in Instrument Reading and Setting


Table 2 
Number of Errors at Each Position
Totals 5 > 

ee 2 1 2 3 4 5 6 7 8 9 

ide Rule Readings ~ 36g 5 37 7 

A a gS 4 95 2. 5 7 
Pe Rule Stns y i í ği j $ $ in 7 
ana Over (5 8 114 139 5 3 112 174 102 2 
“a t Cards 4 32 117 3 80 152 202 14 6 


The column headed “Errors over .5” refers to the number of errors of
a magnitude greater than .5 when subjects set the slide rule. This
column is comparable to “Slide Rule Errors” in reading. However the
numbers are larger in most cases, ranging from 0 to 104. Apparently
most subjects make more gross errors in setting the slide than they
do in reading the experimenter's settings.

The column in Table 1 marked “Miller Card Errors” shows the number
of errors made by each subject on the 10 mm. Miller cards. Errors
range from 0 to 77, again indicating large individual differences.

Comparing all four of the above mentioned columns it can be seen
that the subjects who had the lowest number of slide rule errors
also made very few errors in other aspects of the experiment. The
rest of the subjects do not show such consistency. Thus subjects who
made a large number of errors in slide rule readings did not
necessarily show large errors for “Mean Discrepancy in Setting,”
“Errors over .5,” or the “Miller Cards.”

“Slide Rule Errors” correlate .62 with “Mean Discrepancy in
Setting,” .61 with “Errors Over .5” and .37 with “Miller Card
Errors.” This last correlation indicates that the two seemingly
similar situations are not as similar as might appear at a
superficial glance. The Miller cards cannot be adequately used as a
substitute for slide rule readings.


Table 2 is concerned with errors in relation 
to position on the scale and includes all sub- 
jects concerned. Errors in reading the slide 
rule, in setting the slide rule, and in reading 
Miller cards all show a sharp dip at position 5. 
Errors in setting the slide rule and in reading 
Miller cards are low at positions 1 and 9, but 
this does not hold true for the same positions 
in reading the slide rule. This re-emphasizes 
the fact that reading a slide rule and reading 
Miller cards are not equivalent processes. 

Table 3 is concerned with the general nature of biases. Although
biases vary somewhat from subject to subject some general trends
can be noted.

The biases shown for “Slide Rule Readings” indicate that subjects
tend to use the end lines and an imaginary line at position 5 as
points of reference. For example, at position 6 there is a strong
preponderance of plus errors. That is, subjects tend to read an
actual 6 as a 7, as if they thought of 6 as lying closer to the
imaginary center line than it does. In making settings at position 6
the subjects tend to set closer to the center, resulting in a minus
mean discrepancy. The bias in reading and the bias in setting are
therefore entirely consistent.
There seems to be a definite tendency on 
the part of the subjects to think of 1, 2, 8, and 
9 as closer to their respective end lines and 
to think of 3, 4, 6, and 7 as closer to the imagi- 
nary center line than is actually the case. 


Table 3 


>= Prep 
=. Tt 
wide R S 1 2 3 
Mea, ule Reng: TA 
MAE Sega dings +39 + og 
"Carda aig =B T 
44 +30 — 


derance of Error at Each Position 
on 


4 a 6 7 8 9 
—54 —11 +13 +7 —67 ET 
23 +.08 —.16 —.12 +.10 4.10 
—126 —10 +148 +170 =o =F 




Miller cards show biases which are similar 
to those found in “Slide Rule Errors,” although 
the degree of bias is greater at some positions 
and less at others. 


Summary 


Thirty subjects of both sexes and of various 
ages and occupations were required to make 
interpolations between marks 10 mm. apart 
in three different situations: (1) reading from 
a slide rule set by experimenter to exact tenths, 
(2) setting the rule to tenths, and (3) reading 
tenths from Miller cards. Each subject made 
324 interpolations in each of the three situa- 
tions. 

Results showed large individual differences among subjects. Slide
rule reading errors ranged from 0 to 87. Slide rule setting errors
in excess of half a unit ranged from 0 to 104. Errors in reading
Miller cards ranged from 0 to 77. The correlation between slide rule
readings and the mean discrepancy in setting was .62, between slide
rule readings and errors over .5 was .61, and between slide rule
readings and Miller card readings was .37.

In all three methods errors were made less frequently at position 5
than at any other.

Readings at positions 1, 2, 8, and 9 showed an inward bias, possibly
due to the use of the end lines as reference points. Readings at
positions 3, 4, 6, and 7 showed an outward bias, possibly due to the
use of an imaginary line at the center as a reference point.


Received February 26, 1951. 


A Note on “Simplification of Flesch Reading Ease Formula” 


George R. Klare 


The Psychological Corporation, New York City 


In a recent article (3), Farr, Jenkins and Paterson propose a
simplification of the Flesch Reading Ease formula (4). In the
simplification, Flesch's average sentence length factor remains
unchanged, but number of one-syllable words is substituted for
Flesch's factor of number of syllables per 100 words. The basis for
this change is that the authors feel that analysts often have
difficulty in determining the number of syllables in polysyllabic
words, and that the “simpler method would obviously be much faster
and would require no knowledge of syllabification on the part of the
analyst. It would merely require the analyst to recognize and count
the number of one-syllable words.”
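For readers who want to see the two formulas side by side, the sketch below states them in code. The coefficients are not given in this note; they are the values commonly cited for Flesch's 1948 Reading Ease formula and for the Farr, Jenkins, and Paterson simplification, and should be checked against the original articles (3, 4) before being relied upon. The function and parameter names are illustrative.

```python
def flesch_reading_ease(sentence_length, syllables_per_100_words):
    """Old formula: word factor is the number of syllables per 100 words.
    Coefficients are the commonly cited ones (an assumption here)."""
    return 206.835 - 0.846 * syllables_per_100_words - 1.015 * sentence_length


def simplified_reading_ease(sentence_length, one_syllable_words_per_100):
    """New formula: word factor is the number of one-syllable words
    per 100 words. Coefficients are the commonly cited ones (an assumption)."""
    return 1.599 * one_syllable_words_per_100 - 1.015 * sentence_length - 31.517


# A hypothetical passage near the "standard" level: 20-word sentences,
# 150 syllables and 70 one-syllable words per 100 words.
old_score = flesch_reading_ease(20, 150)
new_score = simplified_reading_ease(20, 70)
print(round(old_score, 1), round(new_score, 1))
```

The sentence length term is identical in both versions; only the word factor differs, which is the point at issue in the questions that follow.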

There is little doubt that many analysts, particularly inexperienced
ones, have difficulty in recognizing syllables. Even language
experts find syllables hard to define (5). This writer would like to
raise several questions, however, concerning the two advantages
claimed for counting one-syllable words over counting all the
syllables in words.

It is stated that the simpler method would require no knowledge of
syllabification on the part of the analyst. How can an analyst with
no knowledge of syllabification recognize and select one-syllable
words from the mixture of one- and one-plus syllable words found in
writing? Compare “piano” ... Using such words, familiar to most
fourth grade pupils, as examples, very nearly as much knowledge of
syllabification is required, or as much difficulty met, in selecting
one-syllable words as in counting all the syllables, unless, of
course, the analyst uses a list of one-syllable words. Since such a
list would have to include several thousand words, speed of
application would be sacrificed.
The second question, then, is whether or not the simpler method
would, as stated, “obviously be much faster.” Since Table 5 of the
authors' article indicates that most writing contains a majority of
one-syllable words it seems probable that about the same majority
of the remaining polysyllabic words should contain but two
syllables. An analyst with knowledge of syllabification would find
relatively few many-syllabled words to count. Would a significant
amount of time be saved by the proposed method over the method of
“reading silently aloud” and counting all syllables?
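The difficulty raised in the first two questions can be illustrated with a crude mechanical syllable counter. The sketch below is not a method proposed by any of the authors; it is a deliberately naive heuristic (one syllable per run of vowel letters) introduced here only to show that picking out one-syllable words requires the same judgment as counting syllables. It misjudges “stirred” (two vowel runs, one spoken syllable) and “piano” (two vowel runs, three spoken syllables).

```python
import re


def estimate_syllables(word):
    """Naive estimate: one syllable per run of vowel letters (a, e, i, o, u, y).
    This is a rough heuristic, not a linguistic rule."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def word_counts(text):
    """Return (words, one-syllable words, total syllables) for a passage,
    all based on the naive estimate above."""
    words = re.findall(r"[A-Za-z']+", text)
    syllables = [estimate_syllables(w) for w in words]
    one_syllable = sum(1 for s in syllables if s == 1)
    return len(words), one_syllable, sum(syllables)


for w in ["stirred", "doesn't", "piano", "readability", "simplification"]:
    print(w, estimate_syllables(w))
```

Whichever count is wanted, the same decision has to be made for every word; the two methods differ in what is tallied afterwards, not in whether syllables must be recognized.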


A third and related question also arises. Would not each counting
error be magnified, and reliability decreased, by the new method?
Since writing contains more syllables than one-syllable words, this
would seem to be the case unless one could be assured that the
analyst would make several syllable errors to each word error.
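The third question can be put in numbers if one assumes the commonly cited weights for the word factor in each formula (0.846 per syllable in the old formula, 1.599 per one-syllable word in the new). These weights are not stated in this note and are an assumption of the sketch; the names are illustrative.

```python
# Assumed weights on the word factor (per 100 words); not given in this note.
OLD_WEIGHT_PER_SYLLABLE = 0.846
NEW_WEIGHT_PER_ONE_SYLLABLE_WORD = 1.599


def score_shift(count_error, weight):
    """Change in the reading ease score caused by a miscount of the given size."""
    return abs(count_error) * weight


shift_old = score_shift(1, OLD_WEIGHT_PER_SYLLABLE)           # one syllable miscounted
shift_new = score_shift(1, NEW_WEIGHT_PER_ONE_SYLLABLE_WORD)  # one word miscounted

# Number of syllable errors that move the old score as much as
# a single word error moves the new score.
break_even = NEW_WEIGHT_PER_ONE_SYLLABLE_WORD / OLD_WEIGHT_PER_SYLLABLE
print(shift_old, shift_new, round(break_even, 2))
```

Under these assumed weights a single miscounted word shifts the new score by nearly twice as much as a single miscounted syllable shifts the old one, so the new method would be less reliable unless syllable errors were roughly twice as frequent as word errors.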


Received November 26, 1951.
Published out of turn by the editor.


References 


1. Dale, E., and Chall, Jeanne S. A formula for predicting
readability: instructions. Educ. Res. Bull. (Ohio St. Univ.), 1948,
27, 37-54.

2. Dewey, G. Relative frequency of English speech sounds. Cambridge,
Mass.: Harvard Univ. Press, 1923.

3. Farr, J. N., Jenkins, J. J., and Paterson, D. G. Simplification
of Flesch reading ease formula. J. appl. Psychol., 1951, 35, 333-337.

4. Flesch, R. F. A new readability yardstick. J. appl. Psychol.,
1948, 32, 221-233.

5. Gray, L. H. The foundations of language. New York: The Macmillan
Co., 1939.

6. Miller, G. A. Language and communication. New York: McGraw-Hill
Book Co., 1951.


Reply to “Simplification of Flesch Reading Ease Formula” 


Rudolf Flesch 


Dobbs Ferry, New York

In their paper “Simplification of Flesch Reading Ease Formula” (1),
Farr, Jenkins, and Paterson propose to replace the count of
syllables per 100 words in my Reading Ease Formula (3) by a count of
the number of one-syllable words per 100 words. Their proposal is
open to criticism on three grounds:

1. The raison d'être for the new formula is that it is a
“simplification.” Farr, Jenkins, and Paterson say that “this simpler
method would obviously be much faster and would require no knowledge
of syllabication on the part of the analyst.” However, this is not
obvious at all. The average number of one-syllable words per 100
“standard” words is about 70 (see Table 1 below), whereas the
average number of syllables per 100 “standard” words is about 150.
Since syllables are counted practically by counting only those
beyond the first, this means that the new formula, on the average,
counts 70 items where the old formula counted only 50; in other
words, the “faster” new formula typically means about 40 per cent
more work in measuring the word factor. As to knowledge of
syllabication, the analyst needs just as much for the one count as
for the other; in fact, words like “stirred,” ... or “doesn’t” are
more liable to offer problems than words like “simplification” or
“readability.” So the “simplified” formula may well be more
cumbersome than the old one.

2. Farr, Jenkins, and Paterson based their new formula on 360
100-word samples from 22 General Motors employee handbooks. These
handbooks ranged from 36 (“Difficult”) to 57 (“Fairly Difficult”).
According to the authors, “It is safe to predict” that the reported
correlation of .95 between the old and the new formulas would reach
.99 if material on every level from “Very Easy” to “Very Difficult”
had been sampled. That prediction is not safe. The correlation that
was found for 22 homogeneous GM handbooks within the narrow range of
36 to 57 is apt to fall rather than rise when heterogeneous
materials over a wide range of Reading Ease scores are sampled. To
illustrate, I offer Table 1, which shows the comparative counts and
scores for the 11 examples given in my booklet How to Test
Readability (2). The examples

Table 1

Comparison of 11 Examples from How to Test Readability as to Two
Syllable Counts and Two Reading Ease Scores

Example  No. of One-Syllable Words per 100 Words  No. of Syllables per 100 Words  Reading Ease Score (Old Formula)  Reading Ease Score (New Formula)
2 20 122 a 7 
3 z 124 e ag 
4 A 131 81 r 
5 3 127 30 r 
ia 70 141 69 ie 
Ta 144 68 : 
L 145 6 
: 70 45 66 
Š 68 152 60 
10 71 o 48 re 
11 58 143 a n 
175 50 
54 


range from Example 1 (“Very Easy”) to Example 11 (“Very Difficult”).
As Table 1 shows, the two formulas give roughly equivalent results
around the “standard” level, but tend to diverge more and more
toward either end of the scale. The “ceiling” in one-syllable words
per 100 words is about 80, the “floor” about 60. So the new formula
apparently tends to underrate both ease and difficulty. Over the
full range, it appears to be less sensitive than the old formula.
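The arithmetic behind the “40 per cent more work” estimate in point 1 can be laid out explicitly. The sketch uses only the averages quoted in the text (about 70 one-syllable words and about 150 syllables per 100 “standard” words) and the stated practice of counting only syllables beyond the first in each word; the variable names are illustrative.

```python
WORDS = 100
ONE_SYLLABLE_WORDS_PER_100 = 70   # average quoted in the text
SYLLABLES_PER_100 = 150           # average quoted in the text

# New formula: every one-syllable word is an item to be counted.
items_new = ONE_SYLLABLE_WORDS_PER_100

# Old formula: only syllables beyond the first in each word are counted,
# since the first syllable of each of the 100 words is taken for granted.
items_old = SYLLABLES_PER_100 - WORDS

extra_work = (items_new - items_old) / items_old
print(items_new, items_old, f"{extra_work:.0%}")
```

The comparison rests on the assumption that each counted item costs the analyst about the same effort under either method, which is the premise of the argument rather than a measured fact.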

3. The proposed new formula seems a step in an undesirable
direction, resulting in a coarser rather than a more precise
measure. In effect, Farr, Jenkins, and Paterson reduce readability
to the use of short sentences and one-syllable words. This is
simplification with a vengeance, giving fresh ammunition to those
critics of readability measurement who dismiss it as a movement
toward “baby talk” or “primer style.” After all, the count of
one-syllable words was abandoned by students of readability some
fifteen years ago. To return to it now would coarsen the technique
of readability measurement and impair its value as a diagnostic tool
in the improvement of communication.

Received December 31, 1951.
Published out of turn by the editor.

References

1. Farr, J. N., Jenkins, J. J., and Paterson, D. G. Simplification
of Flesch reading ease formula. J. appl. Psychol., 1951, 35, 333-337.

2. Flesch, R. How to test readability. New York: Harper & Brothers,
1951, pp. 56.

3. Flesch, R. A new readability yardstick. J. appl. Psychol., 1948,
32, 221-233.


Reply to Klare and Flesch re “Simplification of Flesch Reading Ease
Formula”

James N. Farr, James J. Jenkins, Donald G. Paterson, and George W.
England

Department of Psychology, University of Minnesota

Both Klare (5) and Flesch (3) attack the statements that this
simpler method “would require no knowledge of syllabification on the
part of the analyst” (2, p. 333) and that it “would require no
knowledge of syllabification on the part of the analyst” (2, p.
33...). The writers plead guilty to careless overstatement. What
should be substituted is a statement to the effect that this simpler
method requires less knowledge of how to break polysyllabic words
into their component parts because such words are to be ignored.
Although we admit that the simpler method may require as much
knowledge as does the more complex method, the great advantage of
the simpler method is the fact that it would obviously be much
faster. But both Klare and Flesch deny that the simpler method would
be faster. Klare, considering the relationship between one-syllable
word counts and the number of syllables per 100 words, merely states
his belief that no significant amount of time would be saved.
Flesch, using similar reasoning, also denies that the simpler method
would be faster. He even goes so far as to argue that about 40 per
cent more work would be required when the simpler method is used. As
the great Chicago physiologist, Anton J. Carlson, was so fond of
saying, “Vass iss de evidence?”

Instead of using armchair reasoning, ar-
rangements were made in October, 1951 to
obtain the “evidence.”¹ The mean time in
seconds for making the one-syllable word 
counts and looking up the reading ease scores 
in the Farr, Jenkins, and Paterson table (2) 


was 82 with a standard deviation of the dis- 


Applied i H 
old method of computing reading ease scores for 201 
hundred 
an AB, 
immedi 


, Fricke; Richard S. Hatch; Raymond C. 
Richard Č. Maass; Paul W. Maloney; Beng rt Me 


Collum; 
A. Peterson. 




Table 1 


Comparison of Eleven Examples from How to Test Readability as to Two Syllable Counts and Two Reading 
Ease Scores as Computed by Flesch, by Paterson, and by Mueller 


          Number of One-Syllable Words           Reading Ease Scores
          per 100 Words Computed by:         New Formula          Old Formula

Example   R.F.*   D.G.P.*   J.M.M.*     R.F.   D.G.P.   J.M.M.       R.F.
“=i 80 80 80 84 84 84 a 
2 82 85 81 86 91 85 89 
3 77 77 77 76 77 7 51 
4 77 80 77 72 77 72 80 
5 73 71 72 67 64 65 6 
6 70 70 71 64 64 65 68 
7 70 72 72 62 65 65 66 
8 70 72 69 63 65 61 80 
9 68 70 70 57 60 60 48 
10 71 70 71 43 42 43 H 
11 58 61 61 30 36 36 a 


* R, F. refers to Flesch, D. G. P. refers to 
Joyce Mark Mueller who independently rated 
seven samples in Table 2. 


tribution of 36.8. The mean time for making 
the syllable counts and looking up the reading 
ease scores in the Farr and Jenkins table (1) 
was 147 with an S. D. of 62.8. Thus, the 
evidence fully substantiates the claim that 
the simpler method is obviously faster. 
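The timing figures reported above lend themselves to a quick check of how large the saving is. The sketch uses only the four statistics quoted in the text (means of 82 and 147 seconds, standard deviations of 36.8 and 62.8). The standardized difference is an addition for illustration and assumes two groups of equal size, since the number of timed samples cannot be recovered reliably from this copy.

```python
import math

MEAN_NEW_SEC, SD_NEW_SEC = 82.0, 36.8    # one-syllable word count method
MEAN_OLD_SEC, SD_OLD_SEC = 147.0, 62.8   # syllable count method

seconds_saved = MEAN_OLD_SEC - MEAN_NEW_SEC
fraction_saved = seconds_saved / MEAN_OLD_SEC
speed_ratio = MEAN_OLD_SEC / MEAN_NEW_SEC

# Standardized mean difference with a pooled S.D.,
# assuming equal numbers of timed samples per method (an assumption).
pooled_sd = math.sqrt((SD_NEW_SEC ** 2 + SD_OLD_SEC ** 2) / 2)
standardized_difference = seconds_saved / pooled_sd

print(seconds_saved, round(fraction_saved, 2),
      round(speed_ratio, 2), round(standardized_difference, 2))
```

On these figures the simpler method takes a little over half the time of the syllable count, a difference of more than one pooled standard deviation.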

Klare believes that the simpler method would magnify each counting
error and thus decrease reliability. A thorough-going study of the
reliability of both methods would be needed to settle this issue.
Data given below, however, would not lead one to put much stock in
Klare’s belief.


Flesch attacks 
old and 


We have recom 
eleven samples a: 


< ocretary—Mr 
Paterson, and J. M. M. refers to Mr. Paterson’s sereta jivional 
these eleyen samples and also independently rated the ac 


‘Se 


systematic bias at the extremes. It will be noted that no such bias
really exists. Furthermore, although slight differences in the
one-syllable word counts are shown, no serious differences in the
new reading ease scores are involved. Also one will note that the
new reading ease scores and the old reading ease scores are
comparable except for Example 9. This is a passage taken from
Thorstein Veblen’s Theory of the Leisure Class and the discrepancy
appears to be due to the fact that when Veblen uses a polysyllabic
word he does so with a vengeance, jumping to five, six, and even
seven syllable words. It is probable that the old formula gives a
better measure of the difficulty of this writing than does the new,
although a study of a large number of samples from all parts of his
book might not show the same discrepancy.

Flesch’s claim that the new formula is not as sensitive at the
extremes of difficulty is not borne out by the seven additional
samples taken from the rest of Flesch’s book (4). These data are
presented in Table 2. Examples 12 and 17 show the new formula to be
more sensitive than the old in the sense of yielding a lower reading
ease score. Example 16 shows the reverse effect: the new formula
yields a higher reading ease score. But all of this is


Table 2 


Comparison of Seven Additional Examples from How to Test Readability as to Two Syllable Counts and 
Two Reading Ease Scores as Computed by Paterson and by Mueller 


          Number of One-Syllable Words        Reading Ease Scores
          per 100 Words Computed by:       New Formula        Old Formula

Example   D.G.P.   J.M.M.               D.G.P.   J.M.M.          R.F.
12 57 58 12 15 me 
13 78 79 81 83 82 
14 75 75 73 73 72 
15 62 63 36 38 22 
16 84 84 89 89 85
17 31 54 = 25 29 33 
18 71 72 72 73 82 


much ado about nothing since both formulas 
yield substantially equivalent results. 

Flesch’s final criticism is that the new formula is a step in the
wrong direction making reading ease formulas even more vulnerable
to the charge of encouraging “baby talk” and “primer style.” We,
too, deplore this type of charge because it is unfair but we believe
that no larger proportion of the “literary stylists” will attack the
new formula than have been attacking the old formula on these
grounds. Flesch, himself, has given effective answers to this “baby
talk” type of attack in How to Test Readability (4, pp. 40, 41, 45,
48, 49 and 50).

In conclusion, we would stress “time saving” as the great virtue of
the new formula. It is to be hoped that this “time saving” virtue
will lead to a far greater utilization of Flesch’s important
contribution in a greater variety of situations than is now the
case. As matters stand, there is reason to believe that many
practical people think that it takes an expert to make readability
studies. The purpose of the new formula with the table for
facilitating the computation of reading ease scores is to persuade
practical men to use it in their daily work.


Received January 7, 1952. 
Published out of turn by the editor. 


References 


1. Farr, J. N., and Jenkins, J. J. Tables for use with the Flesch
readability formulas. J. appl. Psychol., 1949, 33, 275-278.

2. Farr, J. N., Jenkins, J. J., and Paterson, D. G. Simplification
of Flesch reading ease formula. J. appl. Psychol., 1951, 35, 333-337.

3. Flesch, R. Reply to “Simplification of Flesch reading ease
formula.” J. appl. Psychol., 1952, 36, 54-55.

4. Flesch, R. How to test readability. New York: Harper and
Brothers, 1951.

5. Klare, G. R. A note on “Simplification of Flesch reading ease
formula.” J. appl. Psychol., 1952, 36, 53.


Correction 


Balinsky, B., Blum, M. L., and Dutka, S. 
The coefficient of agreement in determining 
product preferences. J. of Appl. Psychol., 
1951, 35, 348-351 (Oct.). 

Between the time that galley proofs were approved by the authors
and the publication of the above article a number of unfortunate
errors appeared. Correction of these errors is important if proper
application of the formula is to be made.

Whenever $\frac{n}{2}$ or $\frac{m}{2}$ appears in the article it
should be $\binom{n}{2}$ or $\binom{m}{2}$. Statistically, these
symbols have very different meaning.

The incorrect substitution appears as follows: (1) Page 348, column
2, last line; (2) Page 349, column 1, line 12; (3) Page 349, column
1, line 14; (4) In formula on Page 349, column 1; (5) In formula on
Page 349, column 2; (6) In formula Page 350, column 2; it must also
be noted that a printer's error occurred since the n in the formula
was inverted; and (7) Footnote on Page 350, column 2.

The computation, following the formula on page 350, column 1, was
incorrectly inserted by the authors. The number .0473 should be
substituted for .095; 35.1 should be substituted for 63.7; and .01
should be substituted for .001. Fortunately the correction of these
values in no way changes the conclusions.


Received December 29, 1951. 
Published out of turn by the editor. 


Book Reviews 


Welford, A. T. Skill and age: an experimental 
approach. London: Oxford Univ. Press, 
1951. Pp. 161. $1.75. 

This is a research report issued by the Nuffield Foundation at
Cambridge covering work done there during the years 1946 to 1948 by
the Research Unit into Problems of Aging. It is concerned with the
study of performance or skill changes that come with increasing age.

Following a short introduction, the first section of the book is
devoted to a theoretical discussion of these performance changes.
Theoretical explanations are discussed under the four broad
categories of bodily changes, ... of the environment, methods of
dealing with situations, and anticipatory adjustments. A discussion
of the “mechanisms of skilled activity within an individual”
follows. Here the author makes a convincing case for the use of a
new research design for studying performance. This design emphasizes
the importance of accurate measurement of components of total
performance so that methods used in achieving this performance as
well as results can be determined.

Btn A binei statement As ses ag om 
lites, five laboratory experiments dealng 


y . 
with manipulatory skills are reported. Subjects ranged in age from
18 to 82. Each of the experiments illustrates the utility of the
experimental design proposed in the book. The findings point to a
compensatory change of performance with age in which deterioration
in one aspect of total performance tends to be somewhat offset by
improvement in another.
fees laboratory. experiments on ngine ot 
Vo E ered (classified by the altior a La 
Sect; ng “mental skills”) make up te ee 

ot id A general falling off of these n 

or skills with age is reported. 

The final section on experimental results deals with studies made
in the industrial situation, involving a total of 3,211 employees
of ... concerns. A modification of the same research design as that
used in the laboratory was applied in the industrial setting. Total
job performance is broken down into operations, and age
distributions for jobs with and without certain operations are
compared. Two operations studied were found to be associated with
the age of the workers. Older workers are found more often on jobs
in which there is no time-stress or pressure for speed, and they are
more likely to be found in small rather than large work groups.

There is an appendix showing statistical 
significance for all of the findings, and a 23- 
item bibliography. 

As the author points out, general conclusions 
are not justified because of sampling limit- 
ations and the relatively small numbers of 
subjects involved in most of the experiments. 
However, this does not seriously detract from 
the main value of the report which lies in the 
research technique for studying performance. 
The experiments clearly demonstrate the 
usefulness of the technique. The findings 
reported and the discussion of these findings 
suggest a number of hypotheses for other 
researchers in this area. 

Because experimental results are supplemented with stimulating
theoretical speculation into the “why’s” behind the findings, the
book makes highly interesting reading. It is a “must” for other
research workers in this field and is worth while reading for any
experimentalist in the field of human behavior. Appearing at a time
when we in America are becoming more and more aware of the
increasing proportion of older people in our population and the
problems that go along with this condition, this book should be
received as a welcome addition to the relatively sparse research
literature in this important area of applied psychology.


Theodore R. Lindbom 


Prudential Insurance Co., 
Newark, N. J. 


Dooher, M. Joseph, and Marquis, Vivienne
(eds.). Rating employee and supervisory
performance. New York: American Manage- 
ment Association, 1950. Pp. 192. $3.75 
(paper bound). $4.00 (cloth bound). 

This manual of merit-rating techniques 
brings together some of the best merit rating 




material published by the American Manage- 
ment Association during past years. Some 
new material is included. The volume is 
comprised of seventeen chapters (including 
exhibits and appendix) by fourteen contrib- 
utors. Included are sections on basic prin- 
ciples and techniques, scientific approach 
toward rating, special adaptations, company 
case histories, applying rating results, and the 
rating form. 

The quality of contributions is uneven, but 
less so than is usually true of books of this 
type. The editors have limited themselves 
to articles prepared for A.M.A. publications, 
but within this area have used excellent dis- 
crimination. True, overlap and repetition 
are frequent; but fortunately arise in contexts 
which give the impression of intentional (and 
deserved) emphasis. 

Included are some classic studies (e.g., 
Driver’s “Case history in merit rating,” first
published in 1940) which well warrant re- 
publication to be more readily available to 
current readers. Unfortunately this book 
does not cite source or date of original publica- 
tion. This is apt to be a disadvantage to 
compilers and users of bibliographies. 

Most of the articles are written by psychologists for
non-psychologists, and are sound both theoretically and practically.
Compiled primarily for executives, supervisors, and personnel
technicians, this book will be valuable to all workers in the field
of merit rating whether professional psychologists or otherwise.
Inasmuch as no such worker should be without it, the cloth bound
volume is recommended over the paper bound volume even if the
difference in cost were considerably greater than it is.


C. E. Jurgensen 
Minneapolis Gas Company 


Spearman, C., and Jones, L. W. Human
ability. New York: The Macmillan Co.,

15£9, Pp. 198. $2.50, ý 


Factor analysis began with Spearman and it is fitting that he should
be honoured for it. There is all the more reason for regretting that
this last book of his was written. The exact contribution of Wynn
Jones is not mentioned, but in the preface Wynn Jones states, “. . .
when he (Spearman) proposed that the book should appear under our
joint names, I had to point out that my share in it did not merit
such distinction.” It seems, therefore, that Spearman was
responsible for the greater part of the work. The book is
sub-titled, “A Continuation of the Abilities of Man,” and it is just
this which makes the criticism so adverse. In fact Human ability is
little more than a restatement of the earlier work. Even the same
quotations reappear.
Tt is true that there is a fuller discussion ae 
group factors, and it is in some ways oer 
ing to realize that Spearman was very n p 
the truth when he said that group a 
“~~ have been small or rare,” for this make 
vocational guidance, if not selection, & E 
more dificult problem. Again, am 
deserves credit for attempting to give A 
factors more psychological meaning ian, x 
many others. He is not merely sone E 
play with numbers; but this we knew belo 


Douglas Irvine 


National Institute of Industrial Psychology,
London, England 


Panel on Psychology and Physiology,
Committee on Undersea Warfare of the
National Research Council. A survey report
on human factors in undersea warfare.
Washington, D. C.: National Research
Council, 1949. Pp. 541. $2.25.


“Should another war come, victory may
well be, not on the side of the strongest battal-
ions, not even on the side of the best [...]
missiles, but on the side which has gained [a]
vital 10 per cent in the successful handling
of human factor problems.” This thought
apparently guided the metamorphosis of a
report on research needs for a part of the
National Military Establishment into what
amounts to a survey of the field of [applied]
experimental psychology. While there is al-
ways reference to the psychological aspects
of submarine operation in particular, the
material surveyed is relevant to a wide range
of industrial and military problems of [...]
efficiency.


With some 30 contributors producing
data-packed chapters on as many special prob-
lems, a cohesive review is difficult. However,
D. R. Craig and D. G. Ellson, in their discus-
sion of the design of controls, present a concept
around which the far-flung content may be
organized. This is the concept of the operator
as a bio-mechanical link in a control system.
If the reader holds this concept in mind, the
diverse topics can be seen as special aspects
of the input and output problems of such a
link.
[Among] input factors, the whole topic of
[...] comes to the fore as a useful tool
for [...] the optimal and limiting condi-
tions for the reception of information. The
Survey devotes several chapters to the basic
and [...] principles of vision and audition
in [...] of military tasks. Design
of dials, complex visual displays, [...]
radar and sonar, are but a few of the
special topics discussed.
As a link in a control system, the individual
responds to the information he receives with
[...] activity. Analysis of this [...],
conceived as output, leads to the study
of [...] skills. These are reviewed [...]
movement accuracy, control design,
and the arrangement of working areas.

The [...] between input and output [...]
by a large number of [...] which may be
designated as operator variables.
Surveys of performance as affected by condi-
tions [...] habitability, emotional stress,
[...] of selection and training [...].
Inclusion of these topics [...] the
possibility of construing the concept of
[...] as an instance of mechanomorphism.
[...] leadership, mental health, [...]
are all considered in the [...]
influence upon the efficiency of operation.
Thus, the operator is not viewed as [...]
machinery, even though his contribution to
a man-machine system is the central
consideration.

For the applied fields of industrial and
[military] psychology, or for any field in which
[...] utilization of manpower [...],
the Survey provides a factual [...]
to needed research that is far [...]
than its title implies. The [...]
here presented may, indeed, [...]
to be as crucial for our times as is suggested
in this review’s opening quotation.


Wallace A. Russell 
University of Minnesota 


Berrien, F. K., Comments and cases on human 
relations. New York: Harper and Brothers, 
1951. Pp. xi + 500. $4.50. 

This volume is another of those inspired by 
the case study techniques of the Harvard 
Graduate School of Business Administration. 
As the author says, “This book is the out- 
growth of a somewhat accidental but extremely 
stimulating experience in the fall and early 
winter of 1945, when I had the privilege of 
watching for one semester the initial instruc- 
tion in Human Relations at Harvard College.” 
The approach in this book is that the psychol- 
ogy of human behavior can be taught in the 
form of “human relations.” The author 
indicates the objective of the book as, “I
think of my audience as being composed of 
college students and members of adult educa- 
tion classes—not teachers or experts. At the 
same time I have tried to ‘keep the cookies on 
the middle shelves’—high enough to require 
some stretching but still within reach. This
is no attempt to build a system or a grand 
theory of human relations. I have tried, how-
ever, to develop a picture of human relations
problems in a self-consistent manner around
the theme of a need for social harmony and
self-actualization.”

The book is divided into two parts. Part I
consists of a number of chapters based around
[...] human relations concepts. This [...]
constitutes the first 246 pages. [Part II is]
a series of case studies [...].
Included at the end of the [book is a]
short Instructor’s Appendix, [...]
“The [...] Relations course, [for which ...]
and cases are designed, presupposes [...]
of teaching and a set of objectives [...]
discussion.” He further states, “[...]
a ‘buttoned-up,’ packaged course. [...]
room for a great deal of individual
initiative and variation on the part of instruc-
tors to [...] more effective means of developing
[responsibil]ity in students for their own educa-
tion.”




There can be no quarrel with the purpose 
or intent of the author in devising this volume. 
However, a book of this kind, which is designed 
for adult educational groups, probably would 
have been better written if it had been more 
simply written. For example, the author 
says on page 46, “So far in this chapter we 
have pointed out the subjectiveness of ob- 
servations and the importance of perceiving 
differences among similarities. The task of 
synthesizing our many discrete observations 
into some kind of order remains before us.” 

After spending two chapters in developing 
the need and the setting for the problem of 
cooperative behavior, the author proceeds 
with a rather complete discussion of the prob- 
lem of words. Here again, it seems to this 
reviewer that the author falls into the trap 
of using extremely complicated words and 
phrases to explain human relations problems. 
For example, on page 26, he says, “We have 
at least two kinds of words, or, more precisely, 
words may have two kinds of meanings. The 
semanticists refer to these two kinds of mean- 
ings as extensional and intensional. The 
extensional meaning of a word is the object 
or event in the objective world which the word 
denotes.” Here again it is difficult to con- 
ceive of a group of adults, unless they are 
familiar with the background of semantics, 
being able to understand, without a great 
deal of assistance, what the meanings of these 
words are. If the study group in “human
relations” is going to spend a good portion
of its time in developing a theory of “human 
relations” and “human behavior” prior to 
the time they are going to use the case studies, 
then this book in itself is incomplete. 

The book is well annotated, and reference 


is constantly made to the most recent liter- 
ature. 


Howard P. Mold


Minneapolis-Honeywell Regulator Company 


Gouldner, Alvin W. (ed.). Studies in leader-
ship. Leadership and democratic action.
New York: Harper and Brothers, 1950.
Pp. xvi + 736. $5.00.

Dr. Gouldner has assembled 31 papers which
he groups and presents in five parts as follows:
types of leaders; leadership and its group
settings; authoritarian and democratic leaders;
ethics and technics of leadership; and affirma-
tions and resolutions. At the beginning of
each part, the author has written a section
which is helpful in orienting the reader in the
group of papers which follows, and the author's
introduction is an excellent discussion of some
of the theory and problems involved in the
study of leadership, particularly the contro-
versy between the trait and situationist ap-
proaches.
Most of the papers were written by sociol-
ogists, many of them being unpublished pre-
viously. However, although this type of
volume is needed and could prove very helpful
to psychologists, the coverage is such that
much of its possible value is lost. While the
book carefully points out the mass of literature
which exists on leadership, the material pre-
sented offers little in the way of specific
problems, specific methodologies, or specific
theories supported by experimental evidence.
Therefore, although the mass of literature
again is increased considerably, the study of
leadership generally is neglected. The paper
by Eaton, “Is scientific leadership selection
possible?” constitutes the single major ex-
ception to this, although he limits his discussion
too specifically to military studies and certain
sociological approaches, and neglects such
important considerations as criteria for leader-
ship and an evaluation of the devices discussed.
The recent work of Gardner, Carter, [...]
Henry, the long-term research of the Ohio
State Leadership Studies, the review of leader-
ship trait literature by Stogdill, as well as
other recent experimental contributions of
psychologists and sociologists, receive no at-
tention by Eaton or anywhere in the volume.


It might appear that Dr. Gouldner selected
papers for their value in contributing to and/or
propounding certain social theories and
social reforms and that leadership phenomena
are considered and evaluated in a particular
social frame of reference. There is, [...] a
general assumption that democratic leadership
as these writers discuss it is the leadership to-
ward which all should be striving, even though
there is little attempt to structure the “demo-
cratic action” which is the sub-title of the
book. Perhaps the following statement from
Gouldner’s summary remarks on all of the
papers characterizes the bias of the approach:
“In terms of social policy, there is a tendency
to assume that the era of ‘free competition’
and ‘laissez faire’ is well behind us, that some
form of social planning is inevitable, but that this
poses real dangers to democratic liberties
for which new safeguards have to be invented,
and that the ‘nationalization’ of industry is not
identical with its ‘socialization.’” The ap-
proach is illustrated further in the subject
index which lists 51 references to “apathy,”
but no general references to “criteria” or
“[...],” 88 references to leadership among
various minority groups and trade unions,
but no general references to business, in-
dustrial, religious, or educational leadership.

One of the valid criticisms of many leader-
ship studies in the past has been the indefinite-
ness and vagueness which characterized them.
On the other hand, much recent work has been 
characterized by the attempt to structure, 
to formalize, to coordinate and organize some 
of the theory of leadership and the methods 
for experimentation and study. Gouldner’s 
readings make no major contribution to this 
recent work, and the value of the book is thus 
proportionately decreased. However, much 
of the reading is interesting, including the 
emphasis which is placed on the thinking of 
Freud, Weber, Marx, Durkheim, and Mann- 
heim as it relates to leadership and social 
action, as these writers see it. 


C. G. Browne 
Wayne University 


New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be sent to Donald G. Paterson, Editor,
Department of Psychology, University of Minnesota, Minneapolis 14, Minnesota 


Mental health and Hindu psychology. Swami Akhila- 
nanda. New York: Harper and Brothers, 1951. Pp. 
231. $3.50. 

An introduction to projective techniques. Harold H. 
Anderson and Gladys Lowe Anderson, Editors. New 
York: Prentice-Hall, Inc., 1951. Pp. 812. $6.75. 

Counseling from profiles. George K. Bennett, Harold
G. Seashore, and Alexander G. Wesman. New
York: The Psychological Corporation, 1951. Pp. 95.
$1.75.

Occupational filing plan and bibliography. Wilma
Bennett. La Porte, Ind.: Sterling Powers Publish-
ing Co., 1951. $3.00.

Concepts and programs of counseling. Ralph F. Berdie,
editor. Minneapolis: University of Minnesota Press,
1951. Pp. 81. $1.75.

Statistical methodology reviews 1941-1950. Oscar K.
Buros, editor. New York: John Wiley and Sons,
Inc., 1951. Pp. 457. $7.00.

Punched cards. Robert S. Casey and James W. Perry,
Editors. New York: Reinhold Publishing Corp.,
1951. Pp. 506. $10.00.

A primer for psychotherapists. Kenneth Mark Colby.
New York: The Ronald Press Co., 1951. $3.00.

The executive at work. Melvin T. Copeland. Cam-
bridge: Harvard University Press, 1951. Pp. 278.
$3.75.


A hundred years of psychology. J. C. Flugel. New 
York: The Macmillan Co., 1951. Pp. 424. $3.25. 


The dynamics of psychological testing. Milton S. Gur- 
vitz. New York: Grune and Stratton, 1951. Pp. 
412. $5.50. 

Air war and emotional stress. Irving L. Janis. New 
York: McGraw-Hill Book Co., Inc., 1951. Pp. 280. 
$5.00. 

The drawing-completion test. G. Marian Kinget. New 
York: Grune and Stratton, 1951. About 200 pages. 
About $7.00.


How to hire a bus operator. Merwyn A. Kraft and
Glen U. Cleeton. New York: American Transit
Assoc., 1951. Pp. 28. $2.50.

Basic methods of marketing research. James H. Lorie
and Harry V. Roberts. New York: McGraw-Hill
Book Co., 1951. Pp. 453. $6.00.

Language and communication. George A. Miller. New
York: McGraw-Hill Book Co., 1951. Pp. 2[...].
$5.00.

On being intelligent. Ashley Montagu. [...]:
Henry Schuman, Inc., 1951. Pp. 236. $2.95.

Learning theory and personality dynamics. O. H. Mow-
rer. New York: The Ronald Press Co., 1951. Pp.
776. $7.50.

The family scrapbook. Ernest G. Osborne. New York:
Association Press, 1951. Pp. 457. $3.95.

Human relations in supervision. Willard E. Parker and
Robert W. Kleemeier. New York: McGraw-Hill
Book Co., 1951. Pp. 472. $4.50.

Sex and the law. Morris Ploscowe. New York: Pren-
tice-Hall, Inc., 1951. Pp. 310. $3.95.

Chronic disease and psychological invalidism. Jurgen
Ruesch. Berkeley: University of California Press,
1951. Pp. 191. $3.50.

Spinoza dictionary. Dagobert D. Runes, Editor. New
York: Philosophical Library, 1951. Pp. 309. $5.[...].

Thematic test analysis. Edwin S. Shneidman, Kenneth
B. Little, and Walther Joel. New York: Grune and
Stratton, 1951. Pp. 325. $8.75.

How to help your child develop successfully. [...]
Haller Gilmer. New York: Prentice-Hall, Inc., 1951.
Pp. 368. $2.95.

The clinical method in psychology. Robert I. Watson.
New York: Harper and Brothers, 1951. Pp. [...].
$5.00.

Psychoanalysis and culture. George B. Wilbur and
Warner Muensterberger, Editors. New York: Inter-
national Universities Press, 1951. Pp. 462.

Foreman training in a growing enterprise. A[...].
Boston: Harvard Business School, 1951. Pp. [...].
$3.50.


Journal of Applied Psychology 


Vol. 36, No. 2


APRIL, 1952 


Nineteen-Year Followup of Engineer Interests 


Edward K. Strong, Jr. 


Stanford University 


Knowing only a college freshman’s score in
engineer interest, can one predict with some
degree of certainty: (a) his college major;
(b) his occupational choice while a freshman or
sophomore; and (c) the occupation he will be
engaged in 19 years later? Actually one can-
not predict the specific college major, occupa-
tional choice, or occupational career but one
can predict surprisingly well whether the
[...] will be engineering, some occupa-
tion closely related to engineering or, at the
other extreme, some occupation quite unrelated
to engineering.

The data in this investigation are based on
the Vocational Interest Blanks of 306 Stanford
University freshmen of 1930, a goodly propor-
tion of whom also filled out the Blank in 1931,
1939, and 1949. On each occasion extensive
information was obtained regarding their
education, their vocational choice, and the
positions they had held, together with a
varying amount of reaction to their past and
present activities.

Consider first how permanent or persistent 
are the interests measured by the Vocational 
Interest Test, and second, how well measured 
interests predict choice of occupation. 


Reliability of Interest Scores 


The popular notion is that interests change 
so often and so unpredictably that no forecast 
can be made on such a basis. The facts are 
otherwise as is demonstrated below. 

Using the odd-even technique, the reliability
of the engineer interest scale of the Vocational
Interest Test is .936 (3, p. 77, 4). Burnham
(1) reported a coefficient of .95, using the
test-retest technique for one week and Glass
(2) reported .92 for one month.


Table 1


Test-Retest Correlations: Permanence of Engineer Scores

[Rows in the original: high school juniors, college
freshmen, college sophomores, and college seniors,
each with N and the correlation for the stated
interval in years. The body of the table and most
of its footnote are not recoverable from the scan.
The legible part of the footnote cites Burnham (1),
Glass (2), and Van Dusen (7) and refers to the .75
given in the table.]


Permanency of engineer interest scores with
our college freshmen is .91 for one year, [value
illegible] for nine years, and .76 for nineteen years.
For the ten year interval between 1939 and
1949 the correlation is .87. See Table 1.
Permanence of interest scores from 1 to 22 
years has recently been published (5) in which 
two profiles of 34 interest scores were corre- 
lated. These coefficients average .06 lower 
than those given in Table 1. The two different 
methods of calculating permanency of interest 
scores may be responsible for the differences in 
coefficients. It is more likely that the differ- 
ence is to be explained on the basis that the 
engineer interest scale is one of the most 
reliable of our scales, averaging .06 higher than 
the average reliability of .877 for 36 scales. 
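The two kinds of coefficient quoted above can be illustrated with a minimal sketch. It assumes that the odd-even figure is a correlation between odd-item and even-item half scores stepped up by the Spearman-Brown formula, which is the usual practice but is not stated in this article, and that test-retest permanence is an ordinary Pearson correlation. The item responses are invented for illustration and are not data from the study.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r_half):
    """Step a half-test correlation up to full-test length."""
    return 2 * r_half / (1 + r_half)


def odd_even_reliability(item_scores):
    """Split each person's items into odd and even halves, correlate, step up."""
    odd = [sum(person[0::2]) for person in item_scores]
    even = [sum(person[1::2]) for person in item_scores]
    return spearman_brown(pearson_r(odd, even))


# Hypothetical item responses for five persons on a six-item scale.
responses = [
    [1, 0, 1, 1, 0, 1],
    [0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0],
    [1, 1, 0, 1, 1, 0],
]
reliability = odd_even_reliability(responses)

# Test-retest permanence is the same Pearson formula applied to the
# scores of the same persons on two occasions (hypothetical scores).
retest_r = pearson_r([45, 30, 52, 18, 38], [47, 28, 50, 25, 36])
```

A coefficient obtained this way describes internal consistency at one sitting, whereas the test-retest figures in Table 1 describe stability over time, which is why the two are reported separately.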
Reliability, or constancy, of engineer interest
scores is shown in general in Table 1. Reliabil-
ity is not equal, however, over the entire range
of scores ranging from 60 to -5. The stand-
ard deviations of differences in scores between
test and retest have been calculated for each
of six intervals, i.e., 1930-31, 1930-39, 1930-49,
1931-39, 1931-49, and 1939-49. Averages of
the six sets of data are given in Table 2. The
A ratings (scores of 45 to 68) have the lowest
standard deviation, amounting to only 84 per
cent of the average and the low scores of -5
to 14 have the greatest deviation, amounting
to 117 per cent of the average. It is fortunate
that the high scores, upon which interpretation
very largely rests, are the most reliable of all
scores.


Table 2


Standard Deviations of Differences between Test and Retest Scores

                                   Differences of 15 and More    Per Cent of Scores
             All Data              Omitted (9.2% of Cases)       with Differences
                       Ratio of                    Ratio of      of 15 and More
Scores      N     S.D. Average     N      S.D.     Average
45 to 68    227    7.4    84       217    6.4       91            4.4
30 to 44    390    8.8   100       357    7.3      104            8.4
15 to 29    330    8.7    99       298    6.8       97           10.2
-5 to 14    168   10.3   117       142    7.6      108           15.4
Total     1,115    8.8   100     1,014    7.0      100            9.2


Constancy of interest scores over long
periods of time is remarkable. But this is not
so true for a small minority of 5 to 10 per cent
of cases. Records of several hundred men
tested before and after their war experiences
showed very little change in their profiles of
34 occupational interest scores. But there
were real changes among a few.

Among these freshmen there were 101 cases
among 1115 in which there was a shift of 15
or more in engineer score. In terms of the
distribution of scores about 36 such cases might
be expected but not as many as 101 cases.
Among these exceptional cases there were some
who shifted 15 or more in one direction between
1931 and 1939 and shifted back again between
1939 and 1949. Such men are counted twice.
Elimination of these 9.2 per cent of the total
reduces the average standard deviation from
8.8 to 7.0. Even with the elimination of these
extreme cases there is clear evidence that the
high scores are more reliable and the very low
scores are less reliable than the average,
although now the differences are not so notice-
able. The last column of Table 2 makes clear
that part of the greater reliability of high
scores is the relative absence of cases with
large shifts in score.

There is no noticeable regression of scores in
the interval of 1930-31 and very slight regres-
sion in the interval of 1939-49, but there is
appreciable regression in the four other
intervals, 1930-39, 1930-49, 1931-39,
and 1931-49. Data from all six test-retest
intervals have been combined, however, in
the data given in Table 3. High engineer
scores of 60 to 68 regress 5.4 downward and
the low scores of -5 to 9 regress 8.4 upward.
On the engineer scale the point of no regression
is about 38. No explanation occurs to us as
to why scores regress upward and downward
from 38 as 38 is above the average score of 31
of non-engineers and much above the chance
score of 23.


Table 3


Regression of Engineer Interest Scores

Score        N      Regression
60 to 68      27     -5.4
50 to 59     120     -2.9
40 to 49     202      -.3
30 to 39     268      1.4
20 to 29     228      2.8
10 to 19     191      3.2
-5 to 9       79      8.4
Total       1115      1.6


There may be high permanency of scores as
measured by correlation or standard deviation
of differences in scores and at the same time
there may be increase or decrease in mean
scores. In this case mean scores have changed
very little as is shown in Table 4. College
freshmen scores did not change when retested
as sophomores but did increase by about 3
points in 1939 and 1949.


Table 4


Mean Engineer Interest Scores of College Freshmen

Year Tested    Mean*    Sigma
1930           31.3     14.3
1931           31.0     15.8
1939           34.1     14.2
1949           34.2     14.3

* Differences in mean scores from 1930 to 1939 and
1949 are significant at the 5 per cent level and differ-
ences from 1931 to 1939 and 1949 are significant at the
1 per cent level.
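The article does not say how the figure of about 36 expected cases was obtained. One reading that happens to reproduce it, offered here only as an assumption, is that test-retest differences are treated as normally distributed around zero with the standard deviation of 7.0 that remains after the extreme cases are removed. The function names below are illustrative.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal curve."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_large_shifts(n_cases, sd_diff, threshold):
    """Expected number of cases whose absolute shift reaches the threshold,
    assuming differences are normal with mean 0 and the given S.D."""
    z = threshold / sd_diff
    two_tailed_share = 2.0 * (1.0 - normal_cdf(z))
    return n_cases * two_tailed_share


# 1115 comparisons, S.D. of differences 7.0, shifts of 15 or more.
expected = expected_large_shifts(n_cases=1115, sd_diff=7.0, threshold=15)
observed = 101
excess = observed - expected
```

Under this assumption the expected count comes to about 36, so the 101 observed cases are well in excess of what chance fluctuation alone would produce. The agreement with the printed figure does not establish that this was the author's calculation.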


Distribution of Engineer Interest Scores 


Table 5


Distribution of Engineer Interest Scores of Adult Engineers;
College Freshmen and Seniors; and 4 Sub-Groups of Freshmen
who became Engineers, Chemists, etc., Physicians, and
Lawyers in 1949

[Columns in the original: 513 engineers, 306 freshmen,
285 seniors, and, under "Occupation in 1949 of Freshmen,"
engineers, chemists, etc., physicians, and lawyers. Rows:
score intervals from 70 downward, followed by the mean,
sigma, and per cent overlapping with engineers. The body
of the table is not recoverable from the scan. Legible
means include 50.4 for freshmen who became engineers,
33.9 for physicians, and 22.0 for lawyers.]


Table 5 gives the distribution of engineer
interest scores of the criterion group of 513
adult engineers and of the 306 freshmen. The
two groups overlap 46 per cent. The freshman 
group is heterogeneous. When sub-groups are 
isolated, we find that freshmen who later 
become engineers overlap 99 per cent with the 
criterion group; those who become chemists, 
physicists and geologists overlap 91 per cent; 
physicians overlap 48 per cent; while lawyers 
overlap only 16 per cent. The interests of 
chemists, physicists, etc. correlate about .85 
with the interests of engineers, the interests of 
physicians correlate .52 and the interests of 
lawyers —.44. As the correlations between 
engineer and other groups decrease from 1.00 
toward —1.00, the mean engineer interest 


scores decrease and the per cent of overlapping 
between the two decreases. 
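The article does not state how per cent of overlapping was computed. A minimal sketch of one index in common use, Tilton's overlap for two normal distributions, is given below as an assumption rather than as the author's method. The criterion-group mean of 50 and the sigmas of 10 are likewise assumed for illustration; the lawyer mean of 22.0 is the value that Table 5 appears to report.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal curve."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tilton_overlap(mean_a, sd_a, mean_b, sd_b):
    """Share of area common to two normal curves, using the average of
    the two standard deviations as the common sigma (Tilton's index)."""
    average_sd = (sd_a + sd_b) / 2.0
    separation = abs(mean_a - mean_b) / average_sd
    return 2.0 * normal_cdf(-separation / 2.0)


# Assumed values: criterion engineers at mean 50, sigma 10;
# lawyers at mean 22.0 with an assumed sigma of 10.
lawyer_overlap = tilton_overlap(50.0, 10.0, 22.0, 10.0)
lawyer_per_cent = round(100 * lawyer_overlap)
```

With these assumed inputs the index comes to about 16 per cent, the figure reported for lawyers. The index falls as the means separate, which is the pattern the paragraph above describes.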


Freshman Engineering Interest and 
College Major 


Table 6 gives the distribution of engineer 
interest scores according to college major. 
The majors are arranged in order according to 
the mean engineer interest scores of the 
students so enrolled. Originally two tables 
were prepared, one concerned with under- 
graduate majors and one with graduate majors. 
As the results were very similar, the two tables 
have been combined. Men whose under-
graduate and graduate records are both known
are counted twice in the table. The records
are fairly complete based upon the reports
rendered in 1939 and 1949. No discrepancies
were noted between the two records except
that some men had not finished their academic
work by 1939. A few students completed
only one or two years of college work. They
are included in Table 6 if their major could be
determined from either the courses taken or
their statements; otherwise they are excluded.


Table 6


Distribution of Engineer Interest Scores of Freshmen According to College Major:
Major of Undergraduates and Graduates Combined (Men Who Did Both Are Counted Twice)

[Score intervals in the original: 0-9, 10-19, 20-29, 30-39, 40-49, 50-59, 60-64.
The assignment of cell entries to intervals is not recoverable from the scan;
entries are listed in printed order, lowest interval first.]

College Major        Cell entries as printed          Total        Average Score
Engineering          8.5, 18.5, 17, 6                 50           48.5
Chemistry            4, 4, 3, 2                       13           47.0
Physics              2, 2, 1                          5            44.5
Geology              2, 2, 5, 2                       11           43.0
Medicine             5, 10, 21, 9, 7                  52           35.0
Biology              1, 3.5, 1.5, 3, 1                10           33.5
Education            2, 1, 3, 1                       [illegible]  31.5
Mathematics          2, [illegible]                   6            29.[...]
Accounting           1, 3.5, 3.5, 0.5                 [illegible]  28.5
Business             1, 7.5, 5, 9, 2, 1               25           28.0
English              0.5, 5.5, 9, [illegible], 1, 1   22           27.0
Economics            6, 19.5, 23, 20, 7, 2            [illegible]  25.5
Psychology           1, 1, 2                          4            23.5
History              1, 5, 4.5, 2, 1                  13.5         22.5
Art, Music, Drama    3                                3            22.5
Political Science    5, 6, 8.5, 8                     27.5         22.0
Social Science       3.5, 3, 4, 3, 1                  14.5         21.2
Philosophy           3, 1                             3            21.0
Law                  6, 8, 5, 8, 1                    28           21.0
Foreign Language     1, 2.5, 3.5                      [illegible]  20.0
Misc.                2, 0.5, 1, 1                     4.5          38.5
Total                [illegible]                      [illegible]  31.0

Table 6 and others that follow contain halves, 
as, for example, 8.5 freshmen majoring in 
engineering with an engineer interest score of 
30-39, Some students gave majors in two 
fields, as economics-accounting, or they changed 
their major, for example, from engineering to 
physics. In such cases each major is tabulated 
as ½. The data actually record choices
not men. 
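The half-counting rule can be sketched as follows. The article specifies a weight of one half when a student reports two majors; the generalization to a weight of one divided by the number of majors reported is an assumption made here for completeness, and the student records are invented.

```python
from collections import defaultdict
from fractions import Fraction


def tabulate_choices(students):
    """Tally majors so that each student contributes a total weight of 1,
    split equally among the majors he reported."""
    counts = defaultdict(Fraction)
    for majors in students:
        weight = Fraction(1, len(majors))
        for major in majors:
            counts[major] += weight
    return dict(counts)


# Hypothetical records: one single major, one double major, one change of major.
records = [
    ["Engineering"],
    ["Economics", "Accounting"],
    ["Engineering", "Physics"],
]
tally = tabulate_choices(records)
total_weight = sum(tally.values())
```

Because every student carries a total weight of one, the row totals still add up to the number of men, even though individual cells such as 8.5 are no longer whole numbers.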

There were 101 freshmen who scored 40 and
higher in engineer interest score. See Table 6.
Of these 101 freshmen, 41.5 majored in engi-
neering, 19 in chemistry, physics and geology,
20 in medicine and biology, and 20.5 in some
other field. The proportion majoring in engi-
neering is higher if only scores of 50 and above
are considered, i.e., 51 per cent in this case in
contrast to 41 per cent when scores of 40 and
higher are considered. In other words, if a
freshman rates A in engineer interest there are
about 82 chances in 100 he will major in a
physical or biological science.
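The percentages for freshmen scoring 40 and higher follow directly from the counts just given, as the short calculation below shows. The grouping labels are paraphrases of the text. The figure of 82 chances in 100 refers to A ratings, a different cut-off, and cannot be reproduced from these four counts alone.

```python
# Counts of major choices among the 101 freshmen scoring 40 and higher,
# as stated in the text (half counts arise from divided majors).
high_scorers = {
    "engineering": 41.5,
    "chemistry, physics, geology": 19.0,
    "medicine, biology": 20.0,
    "other": 20.5,
}

total = sum(high_scorers.values())
per_cent = {field: 100.0 * n / total for field, n in high_scorers.items()}

# Combined share for the physical and biological sciences at this cut-off.
science_per_cent = 100.0 - per_cent["other"]
```

At the cut-off of 40 the engineering share rounds to the 41 per cent quoted in the text, and the combined science share is a little under 80 per cent.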

There is a close relationship between engineer
interest scores of freshmen and the subject
matter of their academic work. See Table 6.
As the engineer score decreases the student
majors progressively shift from physical
sciences to biological sciences, to accounting
and business, to social sciences, to law, and
to foreign languages.

Because there are very few students enrolled
in most of the majors not too great reliance
should be placed upon the data for these separate
majors. If there were more cases we should
confidently expect the major of mathematics
to fall between geology and medicine and for
education to appear somewhat lower in Table 6.

Scores on the engineer scale and a few other
interest scales, such as lawyer and accountant,
ought to provide a good indication of what
major the student would find appropriate
for him.

Engineering Interest and Freshman 
Occupational Choice 


When a freshman scores high in engineer 
interest does he choose engineering for his 
future occupation or most any other occupa- 
tion? 

Table 7 lists all the occupations chosen by 
three or more freshmen among a total of 270 
freshmen. Engineer scores are also given for 
28 freshmen who didn’t know what occupation 
they would enter. There were in addition 8 
freshmen who gave a college major instead of 
an occupational choice whose records are not 
included in Table 7. 

A total of 48 freshmen chose engineering. 
The engineer interest scores of these men are 
given in the first row of data in Table 7, with 
a mean score of 45.0. Similarly the engineer 
scores of the 46 freshmen who chose law are 
given with the mean score of 21.0. The last 
row of the table gives the average correlation 
between engineering and the occupation chosen 
by freshmen. Thus in the column headed 45 
there are given the number of freshmen choos- 
ing the listed occupations, 7 choosing engineer- 
ing; 2 choosing chemistry; 2, geology; etc., 
down to 2 choosing a specific business activity, 


namely “wholesale grocer” and “auto busi-
ness.”

Table 7 clearly indicates that as mean
engineer interest score (last column of the
table) decreases the occupation chosen by 
the freshmen differs more and more from 
engineering. In order to summarize the data 
it is necessary to express in some statistical 
form the relationships between choice of 
engineering and choice of other occupations. 
Until we have some way of expressing such 
relationships we can not conclude, except on 
the basis of judgment, whether a shift from 
engineering to medicine is greater or less than 
a shift to law. To measure such relationships 
we have used correlation coefficients between 
the interests of engineers and the interests of 
men in other occupations. Thus, the amount 
of change from engineering to chemistry, or 
medicine, or law is expressed by the evtéspond- 
ing coefficients of .88, 52, and —.44 (3, 4). 
Unfortunately there are no such coefficients 
for many occupations which our students have 
chosen. On the basis of known correlations 


Edward K. Strong, Jr. 


146 6 9L GE 6s 6S e 1c or A 97 oF oL? uone ə8eIaAy 
geg £ I z I z e z 9 z z fe z 8z «^U 7,u0q,, = 
rie t Ir 91 LI og og 6e o st o Sol ST ozz WL o7 
8°02 g I CT £ soyurg, t= 
eez I I I I I if 9 quaursaauy os*— 
oT? I £ 8 9 8 t SL sg oF JOAMVT w- 
0°02 I I A a SF uoneonpy = 
A z S'I £ £ £ SZ oF S'I z Ez ssəursng 9yPədg gg 
S'8I se s s fi s ƏAIƏŞ UBIO oz — 
A I S'E s 9 oF SIT sg t A oF «SSOUISNg,,, or— 
TT g kA + if T g 6 wsgeumof sT- 
POC I ST I S T s9 s ZI snoaurl[aostyyl sr 
PFE I se ce £ 9 L Zz s z cz sse uvpisAgg zs 
Ver S'I S I T ? sosdy gg" 
Ver T z SZ £ s'8 4301039 8s" 
LT T £ S7 z st z b Pi S'6L ysrmay, 8s" 
o'sh £ Z S'S A TI £ 8 ST SP Joousuy 00°T 
əBeay $89 gs os SP OF se og ST oz SI or 6 N OS6T ur eD10q) Jouu 
09 s— puonednooo pm 
uonepog 


O6T UE S291095 389194U7 199W BUFI 


204I [euorvdnssQ ULUWYSILI 0} IAYVJY S2109 4S9197UJ IOUS 


4 IPL 


Nineteen-V ear Followup of Engineer Interests 71 


estimates have been used for the remainder. 
The writer is confident that the majority of 
estimates are sufficiently accurate for this 
purpose. Although we possess a rather un- 
usual amount of information about these 
freshmen there are a number of cases where all 
we have is the title of the occupations and 
there is no way of knowing what functions 
are performed. For example, does “moving
pictures business” mean technical work and
if so, is it engineering or artistic in nature, or
does it mean operating a moving picture
theater as a small business man? Our answers
to such questions have been influenced by all
the positions the man has held. The most
difficult estimate to make concerned the term
“business.” This term covers a wide range of
activities, some of which do not correlate
positively with one another, as for example,
the correlation of −.63 between advertising
man and production manager. Here we
considered all the known correlations between
engineer and business activities and arrived
at the correlation of −.19 to represent the
term “business.” As most business activities
aside from production correlate negatively
up to −.78 for selling life insurance it is
possible that the coefficient of −.19 should
be as low as −.40. An average of our cor-
relations for specific business activities gives
a coefficient of −.25. If −.25 had been used
instead of −.19 the average of all cases would
have been decreased from .20 to .19, an incon-
sequential amount. Various calculations of
this sort lead the writer to believe the esti-
mates are sufficiently accurate for the general
purpose of this article.
Using such coefficients to express the amount of
change from engineering to other occupations,
Table 7 indicates that as such correlations
decrease from 1.00 to −.51 the mean engineer
scores of freshmen choosing the occupations
decrease from 45.0 to 20.8. See first and last
columns of Table 7. The relationship may be
expressed in another way, namely, as the
engineer scores of freshmen decrease
from 65 to 5 the average correlations between
engineering and occupational choices decrease
from .97 to −.40. See top and bottom rows of
Table 7. The correlation between the two
measures, i.e., engineer score and correlation
between engineering and occupational choice,
is .66.


Table 8

Mean Engineer Interest Scores of Freshmen According
to the Degree Their Occupational Choices
Agreed with Engineering

Occupational Choices
Distributed According to     Engineer Interest Mean Scores
Correlation with             Freshman      Sophomore
Engineering                  Choice        Choice
1.00                         45.0          48.8
 .71 to  .99                 44.3          45.5
 .41 to  .70                 33.9          35.6
 .00 to  .40                 34.4          35.5
−.01 to −.40                 18.7          25.8
−.41 to −.70                 22.0          23.1
−.71 to −.99                  —            21.4
Average                      31.2          32.0

A table similar to that of Table 7 was 
prepared showing the relationship between 
engineer interest score in 1930 and sophomore 
choice of occupations in 1931. The results 
were similar to Table 7 and so are not repro- 
duced. This is not surprising since the 
correlation between freshman and sophomore 
choice is .80 (6). Table 8 summarizes Table 7 
and the unpublished table, showing the close 
agreement at the sophomore level between 
engineer interest score and occupational choice 
among those students selecting engineering, 
chemistry and closely related occupations. 
The rises in scores of 45.0 to 48.8 and 44.3 to 
45.5 are caused by a few men with low engineer 
scores changing their choices from these 
occupations to non-engineering occupations. 


Freshman Engineer Interest and Occupations 
19 Years Later 


Occupations engaged in in 1949 have been 
assigned to seven groups according to the 
correlations between the interests of engineers 
and the interests of men in other occupations. 
See column 1 of Table 9. Detailed distributions
of engineer scores are supplied in Table 5 of
the 20 freshmen who became engineers, the 13 
who became chemists, geologists and physicists, 
the 31 who became physicians, and the 21 
who became lawyers. Table 9 again leads
to the conclusion that as engineer mean scores
increase from 22.5 to 50.4 the occupation 
engaged in 19 years later approximates more 
and more closely that of engineering, that is, 
the correlation with engineering increases from 
—.71 to 1.00. And the same conclusion 
results whether interest scores obtained in 
1930, 1931, 1939, or 1949 are employed. The 
last column of the table expresses the above 
in terms of total overlapping. For example, 
the engineer interest scores of chemists, 
geologists and physicists overlap 93 per cent 
with the scores of engineers whereas the 
scores of salesmen overlap only 10 per cent. 
The correlation between engineer score in 
1930 and extent to which the occupation- 
engaged-in in 1949 deviates from engineering is
.55. This coefficient is not as high as the .66
between 1930 score and 1930 occupational 
choice. But it is amazingly high for a correla- 
tion between freshman score in a single occupa- 
tional interest, and occupation 19 years later. 
Many factors contribute to choice of one’s 
occupational career. For the total population 
intelligence is a very important factor. But 
for our specific group of freshmen intelligence 
must play a much less important role. Min- 
imum requirements of academic ability in high 
school and intelligence test for admission to 
Stanford University are high enough so that 
most students can enter most occupations in 
so far as general ability is concerned. Lack of 
finance is a factor largely unrelated to intelli- 
gence or interest. An example of how lack of 
money interfered is that of a freshman who
planned to be a physician. He was unable to
finance attendance at a medical school. Some 
years later he obtained one year’s work in a 
dental school. Since then he has been a 
dental technician. He is not a physician as 
he planned to be while in college. But he has 
not deviated so very far from his original plan, 
as far as interests go, since the interests of 
dental technicians must correlate above .50 
with those of physicians. Family pressure has
been another factor that prevented men from 
engaging in work in harmony with their 
interests. A few of the worst misfits are men 
who early expected to enter their family 
controlled business or profession and today 
have well-paid positions therein, but clearly 
indicate lack of interest in their work. 


Exceptions to the General Trend 
The trend is unmistakable that as engineer 
interest scores increase from 0 to 68 students 
choose occupations while in college and enter 
occupations 19 years later more and more 
closely related to engineering.
There are, however, individual exceptions to
the general trend. A total of 55 freshmen
among 306 obtained an A rating (score of 45
to 68) on engineer interest in 1930. See Table
7. No information is available concerning the
subsequent occupation of ten of these men.
Among the 45 for whom information is
available, 18 became engineers and 27 did not.
What explanation can be given for the fact
that only 40 per cent with A ratings became
engineers?


Table 9

Relationship of 1930 Engineer Score to Correlation Between Engineering and 1949 Occupation

Correlation                                               Mean Engineer Interest        Per Cent Total
Between Engineer                                                                        Overlapping
and Occupation    Examples of 1949 Occupation             1930    1931    1939    1949  in 1930
1.00              Engineer                                50.4    51.3    49.6    51.2  [illegible]
[illegible]       Chemist, Industrial Engineer            47.9    48.0    50.2    50.5  [illegible]
[illegible]       Production Manager                      34.4    35.7    30.4    37.5  [illegible]
[illegible]       Purchasing Agent, Statistician          25.3    21.8    26.8    25.0  [illegible]
[illegible]       Accountant, Writer, Personnel           31.3    28.1    30.4    31.6  [illegible]
[illegible]       Lawyer, Sales Manager, Retail           [illegible]
−.70 to −.78      Salesmen                                22.5    20.3    20.8    23.2  10
Total, Non-Engineers                                      [illegible]
Total**                                                   [illegible]

[Legible values in the two total rows: 30.6; 29.6, 28.7, 32.5, 32.2. Row assignment unclear.]

* Only 9 cases in 1930, 7 in 1931, 8 in 1939 and 1949.
** Total is 218 in 1930, 182 in 1931, 148 in 1939 and 1949.


Table 10

Occupations Actually Engaged in in 1949 by 45 Freshmen with A Rating in Engineer Interest Score and Occupations
They Should Have Entered on Basis of Their Highest Occupational Interest Score

Correlation     Occupation Assigned on             Occupation Actually
with            Basis of Highest Score             Engaged in in 1949
Engineer                                   N       N
1.00            Engineer                  20.8     18
 .88            Chemist                    7.3      4  (2 Geologists, 1 Physicist)
 .63            Farmer                     4        3
 .52            Physician                  3        7
 .60            Production Mgr.            3           (1 Industrial Engineer)
[illegible]     C.P.A.                     2
 .58            Dentist                    1.5         (1 Dental Technician)
 .50            Architect                  1
 .41            Printer                    1
 .45            President, Mfg. Co.        1           (1 Owner of Moving Picture Business)
−.58            Advertiser                  .3         (1 Writer)
                                                       (6 miscellaneous, see text)


The average man scored on 34 occupational
scales obtains about three A ratings. In other
words, the average man has the interests
peculiar to men successfully engaged in three
different occupations. Consequently, the
chances are that among 45 men with A ratings
in engineer interest only 15 of them would
engage in engineering work.

Actually these 45 men secured 217 A
ratings, an average of 4.8 ratings each. Two
reasons may be advanced for the unusual
number of A ratings. First, the sample was
selected on the basis of high scores in engineer
interest. Men with high scores on any scale
are likely to have more high scores all told
than men not so selected. Second, engineer
scores correlate .40 and higher with 14 other
occupational scores in contrast to the average
scale which correlates to this extent with only
[illegible] other scales. It is therefore to be expected
that men with high engineer scores will
average more A ratings than the average man.

If the 45 men were distributed on a propor-
tionate basis among the 217 A ratings we
would have 9.3 entering engineering, 7.0
entering chemistry, 5.2 entering farming, 3.7
entering production management, 3.3 entering
medicine, and 16.5 entering twenty other
occupations. On this basis we have 9.3 in
engineering in contrast to 18 who actually
became engineers.
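The arithmetic behind the two chance expectations quoted above (15 engineers and 9.3 engineers) can be checked with a short sketch. It uses only the counts given in the text; the function name is illustrative, and the step that 45 of the 217 A ratings are engineer ratings is an inference from the fact that the 45 men were selected for having that rating.

```python
# Figures quoted in the text.
N_MEN = 45                 # freshmen with an A rating in engineer interest
TOTAL_A_RATINGS = 217      # A ratings those 45 men earned on all scales
TYPICAL_A_PER_MAN = 3      # "about three A ratings" for the average man


def proportional_share(n_men, ratings_on_scale, total_ratings):
    """Men allotted to one occupation if men are spread in proportion to A ratings."""
    return n_men * ratings_on_scale / total_ratings


# Average number of A ratings per man in this selected sample.
mean_ratings = TOTAL_A_RATINGS / N_MEN

# First expectation: each man splits evenly among his three A-rated occupations.
chance_expectation = N_MEN / TYPICAL_A_PER_MAN

# Second expectation: every one of the 45 men holds an engineer A rating
# (inference from the selection rule), so 45 of the 217 ratings are engineer ratings.
expected_engineers = proportional_share(N_MEN, N_MEN, TOTAL_A_RATINGS)

print(f"mean A ratings per man: {mean_ratings:.1f}")
print(f"expected engineers, three-way split: {chance_expectation:.0f}")
print(f"expected engineers, proportional to ratings: {expected_engineers:.1f}")
```

Both expectations fall well below the 18 men who actually became engineers, which is the comparison the text draws.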


The above calculations assume that all A 
ratings are equally significant. The writer 
has frequently maintained that this is true, 
that an occupational interest with a score of 
45 and another occupational interest with a 
score of 65 should both be carefully considered. 
There is a great deal of truth in this statement. 
Nevertheless, the data we have handled in our 
twenty year follow-up of Stanford students 
make clear that the higher the score the 
greater the likelihood the man will actually
enter the occupation.

If now we consider, not the 217 A ratings,
but only the highest score each man received,
what occupations will be assigned to the 45 
freshmen on that basis? The left hand half of 
Table 10 gives the answer. On this basis 
20.8 would be engineers in contrast to 18 who 
did become engineers. The right hand half 
of the table gives the actual distribution. 
Thirty-two men can be definitely called 
engineers, chemists, farmers, and physicians 
matching 35.1 men in the left hand distribution. 
In addition seven men are listed within 
parentheses as approximating fairly closely 
occupations on the left hand side. This gives
good agreement between theoretical expecta- 
tion and actuality on the part of 39 among 
the 45 cases. 

The six remaining cases are: (1) a director
of organizational planning who might be
likened to a production manager although
another with the same title is neither an
engineer nor production manager by education
or function (college major, engineering); (2)
a clerk in a shipping firm in 1939 who chose
business in 1930 and shipping in 1931 (major,
economics); (3) a mail carrier who had 2½ years
of engineering and has held a miscellaneous
assortment of jobs since then; (4) a partner in 
a small retail business (major, engineering); 
(5) a vice-president of a wholesale grocery 
owned by his family who chose this both 
freshman and sophomore years (major, his- 
tory); and (6) a salesman who has been in 
business activities since graduation (major, 
engineering). 

The actual occupational careers of the 45 
men and the expected careers on the above 
basis may be summarized by using the cor- 
relations between engineering and each of the 
occupations. Actually the men entered occu- 
pations correlating on the average .64 with 
engineering but if they had entered the occupa- 
tion on which they had their highest score the 
average correlation would have been .77. 

The reader may decide for himself whether or 
not the 45 men have entered occupations in 
reasonable agreement with their freshman 
interest scores. Considering the fallibility of 
tests today and all the factors that determine 
occupational choice which are independent, 
or largely so, of interests, such as health, 
ability, finance, and family pressure, the over- 
all agreement between engineer interest scores 
and choice of occupation is far greater than 
the writer would have anticipated. 


Summary 


Stanford University freshmen took the
Vocational Interest Test as freshmen in 1930
and the majority also took the test in 1931,
1939, and 1949. A fairly complete record of
their education and occupational career was
supplied on each of the four occasions.

This summary is restricted to scores on the
engineer interest scale and the relationship
of these scores to selection of college majors,
occupational choices when freshmen, and occu-
pations engaged in in 1949, nineteen years later.

The reliability of the engineer interest scale
is .936. Permanency of scores is .91 for one
year, .77 for nine years and .76 for nineteen
years.


Freshmen who became engineers had scores 
when freshmen that overlapped 99 per cent 
with the engineer criterion group, whereas 
those freshmen who became physicians had 
engineer scores that overlapped 48 per cent 
with the criterion group. Similarly, for 
lawyers, the overlapping was 16 per cent. 

The relationships of occupations to engineer- 
ing are expressed by the correlations between 
the interests of men in the various occupations 
and the interests of engineers. 

As the mean engineer interest scores of 
freshmen increase from 0 to 68 there result: 
(1) a progressive shift in college majors from 
languages to law, social sciences, business, 
biological sciences, physical sciences and engi-
neering; (2) a progressive shift in freshman
occupation choice from occupations correlating
on the average of −.44 with engineering to
occupations correlating .97 with engineering
and a correlation between the two measures of
.66; and (3) a progressive shift in occupations
engaged in nineteen years later which correlate 
—.71 with engineering to engineering itself, 
and a correlation between the two measures 
of .55. 

Whatever an interest test measures, whether 
interests, preferences, values, goals, or what 
have you, it measures something very stable 
and permanently possessed and something 
that contributes very greatly to occupational 
choice. 


Received November 16, 1951. 
Early publication. 


References 


1. Burnham, P. S. Stability of interests. Sch. & Soc.,
   1942, 55, 333.

2. Glass, S. S. An investigational analysis of the
   general and specific interests of engineering stu-
   dents. Ph.D. Thesis, Purdue University Library,
   1934.

3. Strong, E. K., Jr. Vocational interests of men and
   women. Stanford, California: Stanford Univer-
   sity Press, 1943.

4. Strong, E. K., Jr. Manual for Vocational Interest
   Blank for Men. Stanford, California: Stanford
   University Press, 1951.

5. Strong, E. K., Jr. Permanence of interest scores
   over 22 years. J. appl. Psychol., 1951, 35, 89-91.

6. Strong, E. K., Jr. Amount of change in occupa-
   tional choice of college freshmen. In press.

7. Van Dusen, A. C. Permanence of vocational inter-
   ests. J. educ. Psychol., 1940, 31, 401-24.


Academic Achievement and Strong Occupational Level Scores * 


John W. Gustad 


Vanderbilt University 


Following the recognition of non-intellectual
factors in academic achievement, increasingly,
in recent years, attempts have been made to
find adequate measures of motivation by
means of which to improve the efficiency of
the prediction of scholastic success. Because
of the method of construction involved, the
Occupational Level score (henceforth to be
referred to as OL) of the Strong Vocational
Interest Blank for men has been suggested as
a promising approach. Strong (7) developed
this scale by contrasting the interests of high
status, professional and business men and
laboring men. Darley (2, p. 60) has described
this variable as follows: “. . . a quantitative
statement of the eventual adult ‘level of
aspiration,’ represents the degree to which the
individual’s total background has prepared
him to seek the prestige and discharge the
social responsibilities growing out of high
income, professional status, recognition or
leadership in the community.” Further,
Darley (2, p. 66) suggests that, “. . . an
excessively low occupational level score seems
at present to be associated with lack of
‘staying power’ or ‘survival power’ in college
competition.”

Several research attempts have been made
to assess the usefulness of OL. Strong
(7, p. 201) studied a group of men in the
Graduate School of Business Administration
at Stanford, dividing his subjects into four
groups in terms of grades for one [illegible].
He then compared the mean OL scores of
the upper and lower groups and found an
insignificant difference of one point. He also
computed the correlation between grades and
OL scores, obtaining a coefficient of [illegible].
Considering the restricted range among
graduate students, however, as well as
[illegible], this is not unusual.

* The present study was conducted with the aid of
funds provided by the Carnegie Foundation for the
Advancement of Teaching, aid for which the writer
wishes to express his gratitude.



Berdie (1), studying engineering freshmen 
with regard to both college satisfaction, and 
scholastic achievement, found that scores on 
the Engineer’s key of the Strong correlated 
only .10 with satisfaction and .13 with grades. 
OL scores correlated .01 with satisfaction and 
.03 with grades. He further found no relation- 
ship between intensity of engineering interests 
and grades. 

Kendall (3), having sorted his men subjects 
into three groups in terms of OL scores, made 
an analysis of co-variance, testing the signif- 
icance of the differences between mean grade 
point averages among the groups with ac- 
ademic aptitude, measured by the Ohio State 
Psychological Examination, held constant. 
He found the variance ratio to be significant 
at between the .05 and .01 probability levels. 
He concluded that, “If used with caution, 
OL scores at the extremes of the distribution 
should be helpful to the counselor in making 
judgments concerning individual cases for 
scholastic success.” 

Ostrom (6, 5) has reported two studies of 
the same sort, one with twelfth grade boys, 
the other with college freshman men. In the 
first, he devised three measures of drive in 
addition to OL: an interview, a teacher rating, 
and a “Guess Who” questionnaire. These 
three all were significantly related to OL, 
giving further indication of its nature. Making 
an analysis of co-variance similar to that of
Kendall, he found no significant differences in 
mean achievement, with academic aptitude 
held constant, between groups differing in 
OL scores. In his second study, with college 
freshmen, he set up six groups in terms of 
both OL and scholastic aptitude scores. 
Making an analysis of variance of the three- 
by-two table, he found academic success to 
be related to both aptitude and OL. 


Purpose 


The present study was undertaken for two
reasons: first, to see whether, at the senior
college level where occupational choices are
most clearly set in terms of major courses,
OL predicts differential success; second, to
allow for the effect of appropriate or inappro-
priate vocational choice, judged in terms of
profiles on the Strong. This latter point, it
seemed, was particularly important. The
question seemed to be if it was realistic to
expect a student to channel his basic motiva-
tion, as measured by OL, into studies if these
were not related in turn to his occupational
interests.


Method 


At the beginning of the winter quarter, 
1950, Strong Vocational Interest Blanks were 
filled out by the junior men. Juniors were 
chosen because they would be making voca- 
tional choices implemented by the selection 
of a major, and because, by the junior year, 
there was greater likelihood that the vocational
interest patterns would have matured. Scores
on the ACE Psychological Examination and
quality point ratios (QPR) were obtained
from the files of the University Counseling
Service and the Registrar. These were avail-
able for 134 men. All were students in the
College of Arts and Sciences.

The interest profiles were first examined to
determine whether they were appropriate in
terms of the students’ major choices. The
system used was that outlined by Darley (2).
In making the judgment of appropriateness,
either a primary in the proper interest area or,
if the student had no primary, a secondary
pattern in the area was accepted as indicating
an appropriate choice. Seventy-four per cent
of the group had appropriate choices, 26 per
cent inappropriate. Next, the appropriate
choice cases were further subdivided according
to major field. There were three groups: (1)
those majoring in Business Administration;


Table 1

Analyses of Variance, with Co-Variance Adjustments for Academic Aptitude, of Scholastic Achievement
for the Several Academic Groups Separated According to OL Scores

                                 Sum of Sq.:   Sum of Sq.:   Sum of             Adjusted      Mean
Group           Source     df    QPR (X)       ACE (Y)       Products     df    Sum of Sq.    Square      F       Decision
Bus. Admin.     Between     2      .584        [illeg.]         54.95      2       .017       .0085      .0284    Accept
                Within     45    15.130        [illeg.]        174.94     45     13.479       .29095
                Total      47    15.714        23,846.41       229.89     47     13.498
Pre-med.        Between     2      .064           132.08          .46      2       .177       .089       .356     Accept
                Within     23     8.160         8,557.96       132.64     22      6.004       .250
                Total      25     8.224         8,690.04       133.10     24      6.181
Miscellaneous   Between     2      .280            375.0         6.47      2       .187       .093       .299     Accept
                Within     21     7.738         15,543.0       153.17     20      6.229       .311
                Total      23     8.018         15,918.0       159.64     22      6.416
Total:          Between     2      .340         1,865.34        25.22      2       .008       .004       .138     Accept
Appropriate     Within     96    33.600        50,374.66     2,466.01     95     [illeg.]     .029
                Total      98    33.840        52,240.00     2,491.23     97     [illeg.]
Inappropriate   Between     2      .224        [illeg.]       [illeg.]     2     [illeg.]     [illeg.]   [illeg.] Accept
                Within     32    [illeg.]      [illeg.]       [illeg.]    31     [illeg.]     [illeg.]
                Total      34    10.366        13,841.14        22.83     33     10.328
Total           Between     2    10.165        [illeg.]       [illeg.]     2     10.006       5.003      22.33    Reject
                Within    131    [illeg.]      [illeg.]        585.23    130     29.122       .224
                Total     133    44.666        66,082.96       604.96    132     [illeg.]


(2) those following the pre-medical curriculum;
and (3) a miscellaneous group. In all, there
were then four groups: three appropriate, one
inappropriate.

Following this, a distribution of OL standard
scores was made for all 134 students and then
divided into thirds as nearly as possible.
Each of the four groups was then further
subdivided according to the following scheme:
(1) high (OL > 56); (2) middle (OL 52-56);
(3) low (OL ≤ 51).

Finally, analyses of co-variance were con-
ducted, following the method outlined by
McNemar (4, Ch. 15), one for each of the
four groups separately, one for all appropriate
choice cases, and then for all cases combined.
In this, the null hypothesis was that there were
no differences in mean quality point ratios
between the three OL groups when academic
aptitude, measured by the ACE, was held
constant.
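The covariance adjustment can be illustrated with the Miscellaneous-group sums printed in Table 1. The sketch below uses the standard single-covariate adjustment, in which the criterion sum of squares is reduced by the squared sum of products divided by the covariate sum of squares. Reading the second sum-of-squares column as the ACE covariate is an assumption made here, and the function name is illustrative.

```python
def adjusted_ss(ss_criterion, ss_covariate, sum_products):
    """Criterion sum of squares left after linear regression on the covariate."""
    return ss_criterion - sum_products ** 2 / ss_covariate


# Miscellaneous group, Table 1: (SS for QPR, SS for covariate, sum of products).
total = (8.018, 15918.0, 159.64)
within = (7.738, 15543.0, 153.17)

adj_total = adjusted_ss(*total)
adj_within = adjusted_ss(*within)
# The adjusted between-groups sum of squares is obtained by subtraction.
adj_between = adj_total - adj_within

df_between = 2
df_within = 21 - 1  # one within-groups degree of freedom goes to the regression slope

f_ratio = (adj_between / df_between) / (adj_within / df_within)

print(f"adjusted total SS:   {adj_total:.3f}")
print(f"adjusted within SS:  {adj_within:.3f}")
print(f"adjusted between SS: {adj_between:.3f}")
print(f"F = {f_ratio:.2f}")
```

Run on these sums, the sketch agrees with the printed adjusted sums of squares (6.416 and 6.229) and with the printed F of about .30 to within rounding.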


Results and Discussion 


The results of the foregoing analyses are
summarized in Table 1. For all groups except
the total, the F tests were not significant.
Consequently, it was concluded that OL, at
this level, was not a significant predictor of
scholastic success within advanced major fields,
even when the students were pursuing curricula
appropriate to their measured interests.

For the total group, however, including
both the appropriate and inappropriate cases,
the F test was significant. From this, it
appeared that the earlier studies were con-
firmed but only in a limited sense. For the
total group, the F ratio for the unadjusted
quality point ratios was also significant
(F = 19.39; d.f. = 2 and 131; P < .01).

While the results for the separate groups
were not significant, it was felt that differences
in OL scores among the groups might have
contributed to the significant F test for the
total group. To check this, an analysis of
variance was made of the OL scores of the
groups. This is summarized in Table 2.
The conclusion was to accept the null hypothesis
that there were no differences between the
groups on OL. It would seem, therefore,
that differences in OL scores had not accounted
for the significant results obtained for the
total group analysis.



Table 2

Analysis of Variance Testing the Differences in Mean
OL Scores of the Academic Groups

            Sums of               Mean
Source      Squares      d.f.     Squares     F       Decision
Between        65.26        2      32.63      1.31    Accept
Within      3,252.89      131      24.83
Total       3,318.15      133
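
The entries of Table 2 follow from the two sums of squares and their degrees of freedom. A minimal sketch of that arithmetic, with variable names chosen here for illustration:

```python
# Sums of squares and degrees of freedom as printed in Table 2.
ss_between, df_between = 65.26, 2
ss_within, df_within = 3252.89, 131

# A mean square is a sum of squares divided by its degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# The F ratio compares between-groups to within-groups variation.
f_ratio = ms_between / ms_within

# The two components should add up to the printed total row.
ss_total = ss_between + ss_within
df_total = df_between + df_within

print(f"mean squares: {ms_between:.2f} and {ms_within:.2f}")
print(f"F = {f_ratio:.2f} on {df_between} and {df_within} d.f.")
print(f"total: {ss_total:.2f} on {df_total} d.f.")
```

An F this close to 1 means the group means on OL differ by about as much as chance variation within groups would produce, which is why the null hypothesis is accepted.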


It remains, then, to account for the obtained 
results and also to try to infer the guidance 
significance of these. As far as the results for 
the total group are concerned, they agree with 
earlier studies; it is in the major groups that 
the differences appear. The first and most 
likely explanation which suggests itself is 
restriction of range on all three variables: 
OL, grades, and ACE scores. College men 
all probably have relatively high OL scores; 
added to this is the effect of selective elimina- 
tion and survival which, by the junior year, has 
left only the more able students, both as to 
ability and achievement. Even among those 
in curricula for which their interest patterns 
are inappropriate, the differences in achieve-
ment between OL groups were insignificant.
Yet, when this group was pooled with the 
appropriate groups, the Total F test was 
significant. Reference to Table 1 will show 
that, since the F for the total appropriate 
group was insignificant while that for the 
total group was significant, it would seem that 
the addition of the inappropriate cases resulted 
in the significant difference. In some way, 
difficult at present to explain, the inappro- 
priate cases appeared to differ from other 
students. 

However, a dean, considering applicants for 
senior college, could use OL scores with caution, 
especially if he took other factors into account, 
to predict success, since in most such groups 
there will be students with both appropriate 
and inappropriate choices. A counselor, on 
the other hand, concerned with an individual 
student and knowing whether the interest
pattern and curricular choice were in line,
would probably not find the OL score partic-
ularly useful. The one exception might be the
case of the student whose OL score was very
low, approximating that of unskilled or semi-
skilled workers. Yet, in the present study, 42
cases (low OL group) had OL scores at or 
below the point (standard score 51) which 
Strong (7, p. 196-197) has indicated as being 
subprofessional.

There seem to be two research designs which 
might be useful in avoiding the restricted range 
problem. First, one might use as subjects a 
group of freshmen engineers, since they are 
following from the outset a curriculum pre- 
sumably near their interests. The study by 
Berdie (1), however, yielded insignificant 
results, although he did not hold academic 
aptitude constant. The other approach would 
make use of the longitudinal design so that a
measure of the restriction of range might be
obtained and used.


Summary and Conclusions 


Having studied the relationship between 
OL scores and college grades, with scholastic 
ability held constant, among a group of junior 
Arts college men, the following conclusions 
seemed to be warranted: 

1. Considering each of the groups separately 
(major groups, inappropriate group, total 
appropriate group), there were no differences 
in grades between groups separated in terms 
of OL scores. 

2. For the total group, appropriate and 
inappropriate cases pooled, there were signif- 
icant differences between scholastic achieve- 
ment means of the OL groups. 


3. No differences were found between the
major groups in terms of OL scores.

4. Restriction of range on all three variables
was suggested as the most likely explanation
of the negative findings.

5. The conclusions of previous studies were
partially supported, but the findings appear to
have more potential value in selection than in
guidance.

6. Two research designs were suggested by
means of which it might be possible to avoid
the restricted range problem.


Received May 7, 1951. 


References 


1. Berdie, R. F. The prediction of college achievement
   and satisfaction. J. appl. Psychol., 1944, 28,
   239-245.

2. Darley, J. G. Clinical aspects and interpretation of
   the Strong Vocational Interest Blank. New York:
   The Psychological Corporation, 1941.

3. Kendall, W. E. The occupational level key of the
   Strong Vocational Interest Blank for Men. J.
   appl. Psychol., 1947, 31, 283-287.

4. McNemar, Q. Psychological statistics. New York:
   John Wiley and Sons, 1949.

5. Ostrom, S. R. The OL key of the Strong Vocational
   Interest Blank for Men and scholastic success at
   the college freshman level. J. appl. Psychol.,
   1949, 33, 51-54.

6. Ostrom, S. R. The OL key of the Strong test and
   drive at the twelfth grade level. J. appl. Psy-
   chol., 1949, 33, 241-248.

7. Strong, E. K., Jr. Vocational interests of men and
   women. Stanford: Stanford University Press,
   1943.



Interest Item Response Arrangement As It Affects Discrimination 
Between Professional Groups * 


John V. Zuckerman 


Human Resources Research Office, The George Washington University 


An important aspect of interest measurement 
methodology is the question of how much 
different item arrangements contribute to 
discrimination between various groups. This 
Problem is of particular significance when the 
groups to be distinguished are quite similar 
in their work and interests, as, for example,
Specialty groups within a single profession. 
It is reasonable to suppose that some particular 
item form might be more effective than another 
in “squeezing out” such small differences as 
would be assumed to exist. 

This study concerns an effort to determine 
the relative merits of two types of interest 
item response arrangement in discriminating 
among the interests of professional groups. 

The research was one phase of a project on
the development of an interest instrument 
intended for medical specialists. 


Interest Inventory Item Arrangement 


There are two methods of arranging interest
items in general use. L-I-D (like-indifferent-
dislike) items permit the choice of a response
among a graded series of attitudes toward a
statement. Forced-choice items require the
selection of one or more alternative statements
over another or others.

The Strong Vocational Interest Blank. one of 
the two best-known interest inventories, uses 
many L-I-D and similar items permitting a 
choice among responses (320 out of 400 items). 

he Kuder Preference Record—V' ocational, 
Consists of 168 triadic items, requiring forced 
Choices of the best- and least-liked statements 
M each group of three. 

ne comparison of the tw 
Ments may be made with respec 


o item arrange- 
t to the number 


* This article is based on research completed at
Stanford University, while the author was a member of the
Medical Specialists Research Project, Department of
Psychology, and represents a portion of a dissertation
submitted in partial fulfillment of the requirements for the
Ph.D. degree at Stanford. The research was sponsored
by the Surgeon General, U. S. Army.




of possible scoring weights available for any 
given statement. When two groups are 
contrasted in interests, a single L-I-D item 
can be given as many as three scoring weights, 
since any three of the six percentages in the 
response table which is produced can be 
changed independently of the others. For 
forced-choice items which use pairs of state- 
ments, but one weight can be secured for each 
two statements. 

Thus an advantage of the L-I-D item form 
is that it is theoretically possible to obtain 
more weights from a given number of state- 
ments in a given physical space than from 
a forced-choice arrangement using pairs of 
statements. Such a forced-choice arrange- 
ment would require a much longer inventory, 
taking more time to administer, if it were to 
equal the L-I-D form in number of weights. 

A possible advantage of the forced-choice 
item form has been brought out in a recent 
critique by Cronbach (1), who suggests that 
such item forms as L-I-D and Yes-No-? give 
rise to the possibility of responses not at all 
related to what the tests are designed to 
measure. These he terms “response sets.” 
Examples of response sets are answering “like” 
to all items on an interest inventory regardless 
of content, or using only the “like” and “‘indif- 
ferent” categories of response because of a habit 
not to dislike anything, or because of a special 
personal definition of disliking. Cronbach (2) 
further contends that such sets reduce test 
validity by introducing extraneous variance, 
and he states that the sets can be eliminated 
by the use of item forms requiring a choice 
among alternative responses, rather than the 
expression of attitudes toward a single state- 
ment. 

Evidence of a quantitative nature favors the 
L-I-D item form, since it has been shown to
differentiate occupations and has been demon- 
strated to be reliable and valid for vocational 
guidance for some twenty years. However, 


John V. Zuckerman 


the theoretical points raised concerning the 
possible additional discrimination provided by 
forced-choice forms provide sufficient reason 
for investigating the relative merits of the 
item arrangements, especially when discrimina- 
tion between similar groups is considered. 
Therefore the question was raised: In an 
interest inventory, which item arrangement 
provides more discrimination between profes- 
sional groups, forced-choice or L-I-D? 


Selection of Vocational Groups for Study 


Because the education profession contains 
well-defined specialty groups of considerable 
size, it was chosen for study. In addition, 
since engineers are known to differ considerably 
from teachers in their interests (Strong, 7) a 
group of electrical engineers was selected to 
contrast in interests with educators. 

It was hypothesized that there would be 
little difference between the two item forms 
for making the “easy” discrimination between 
the interests of educators and electrical 
engineers, but that for the “difficult” dis- 
crimination between sub-groups within educa- 
tion, forced-choice item forms would have an 
advantage because of a tendency to “squeeze 
out” differences of small size. 


Plan of the Study 


Two inventories, differing only in item 
arrangement, were administered to the same 
individuals, in several different professional 
groups, using a counter-balanced order of 
testing in order to control as many sources of 
error variance as possible. 

The responses of members of criterion
groups were subjected to item analysis, and
interest scales were prepared for the groups and comparisons
of the discrimination between groups made
from form to form.


Procedure 
Interest Inventory Construction. An analysis
of the characteristics of certain professional
sub-groups provided the hypothesis that they
differed in interests relating to their differing
working functions. In a preliminary study of
medical specialty groups, differences in the
functions of internists, pathologists, psychiatrists
and surgeons were noted. Four
modes of functioning were named and described
as follows:


Analytic: Preferences for problem-solving,
theorizing, reasoning.

Visual: Preferences for using visual
symbols, as in reading or
map-reading.

Social: Liking for working with
people.

Manipulative: Tool-using preferences; liking
for sports or operating machinery.


The four modes were to be measured by
interest items in inventories consisting of
descriptions of occupational and avocational
activities.

An interest inventory was made up of
paired-comparison forced-choice items based
upon the functional scheme just mentioned.
Data secured from a pretest on 117 college
sophomores and 144 U. S. Army medical officers
were used to refine the modal scales, and
provided a basis for the construction of a second,
more refined interest instrument.

Occupational descriptions from the DOT
(8) and activity items from the SVIB (7) were
rewritten and others of a similar nature devised.
A large number of these was submitted to
six judges who classified the items with reference
to the modes of preference.

The judges selected 65 occupational descriptions
and 54 activity items as unambiguous.
Of these, 60 occupational items and 52 activity
descriptions were chosen by the author and
another psychologist and grouped in fours.
Each group of four contained statements
representing each of the four modes of dealing
with the environment. Thus there were
obtained 15 occupational description clusters,
and 13 activity clusters. Within each cluster
the items were equated for social prestige,
intelligence, education and skills required for
the activities, which it seemed subjectively
desirable to hold constant within the groups
of four statements.

Although care was taken to hold the statement
groups equal for social prestige of
occupations or activities, it was not considered
serious if some errors were made, since a
recent study by Fehrer and Strupp (3) has


Interest Item Response Arrangement 81 


shown that it makes little difference in 
responses if interest items vary in this manner. 

The clusters of items were arranged in a 
random order (separately for occupations and 
activities) and then pairs of statements were 
drawn at random from each cluster and 
arranged on a test form as A-B forced-choice 
items. This procedure was continued until 
all possible pairs (six for each cluster) had 
been formed. 

The resulting forced-choice interest inven- 
tory contained 90 occupational items and 78 
activity items, 168 in all. 

An equivalent L-I-D inventory form was 
produced by shuffling the single statements 
(occupational and activity items were treated 
separately). The form contained 112 items,
60 occupations and 52 activities. 
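
The item counts reported for the two forms follow directly from the cluster design, and can be checked with a minimal sketch. The function name and the tuple representation of a statement are illustrative assumptions; the random ordering of clusters and pairs described above is omitted, since it does not affect the counts.

```python
from itertools import combinations

# The four modes of functioning named in the text.
MODES = ("Analytic", "Visual", "Social", "Manipulative")


def forced_choice_items(n_clusters):
    """Return every A-B pairing of the four modal statements in each cluster.

    A statement is represented here (hypothetically) as (cluster index, mode).
    """
    items = []
    for cluster in range(n_clusters):
        statements = [(cluster, mode) for mode in MODES]
        # All possible pairs of four statements: six per cluster.
        items.extend(combinations(statements, 2))
    return items


occupational = forced_choice_items(15)  # 15 occupational clusters
activity = forced_choice_items(13)      # 13 activity clusters

form_fe_items = len(occupational) + len(activity)
form_oe_items = (15 + 13) * len(MODES)  # one L-I-D item per statement

print(len(occupational), len(activity), form_fe_items)  # 90 78 168
print(form_oe_items)                                    # 112
```

Because every statement is paired with each of the other three in its cluster, each statement appears three times in the forced-choice form and once in the L-I-D form.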

The inventory was titled the Occupational 
and Activity Preference Blank from the nature 
of the items, the two forms being identified as 
Form FE (forced-choice) and Form OE (open- 
ended, or L-I-D). Instructions for self-ad- 
ministration using electrically scored answer 
sheets were prepared.

Subjects. The educational profession is 
divided into two relatively distinct sub-groups 
with specific entrance requirements. Teachers 
constitute the bulk of the profession (9), while 
Supervisors and administrators, including 
principals, vice-principals and superintendents, 
make up the balance. Guidance workers, 
while negligible in percentage in the profession 
as a whole, are represented in sizable numbers 
in the training programs. Subjects in those 
specialties were selected for testing by visiting 
every class in education during a term at 
Stanford University which had more than 
100 students. Both men and women were 
tested, although only men were included in 
criterion groups. 

In addition to the educational specialists, 
a group of electrical engineering students in a 
graduate seminar was tested.

Test Administration. Each group visited
was given instructions concerning the purpose
of the test, which was described as an evaluation
of professional interests. All the subjects
were asked to fill out a vocational information
blank, data from which were used later to
select criterion group members.

Forms FE and OE of the OAPB were then 


passed to the subjects. These were marked 
so that half the individuals at random in each 
group received instructions to begin Form FE 
first and the balance were instructed to start 
with Form OE. No time limit was assigned 
for the completion of the blanks, but instruc- 
tions were given to work as rapidly as possible. 
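
A counterbalanced assignment of this kind can be sketched as follows. This is only an illustration under stated assumptions: the study marked the blanks by hand, and the function name, the seeded generator and the subject identifiers are hypothetical.

```python
import random


def assign_order(subject_ids, seed=0):
    """Randomly assign half of a group to start with Form FE, the rest with Form OE."""
    rng = random.Random(seed)  # seeded only so the sketch is reproducible
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    order = {sid: ("FE", "OE") for sid in shuffled[:half]}
    order.update({sid: ("OE", "FE") for sid in shuffled[half:]})
    return order


# Example: a group of 36 individuals, the size of the timed group.
orders = assign_order(range(36))
fe_first = sum(1 for first, _ in orders.values() if first == "FE")
print(fe_first, len(orders) - fe_first)
```

Splitting within each group, rather than across the whole sample, keeps order effects balanced inside every professional group that is later compared.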

One group was carefully timed. The timing 
for 36 individuals completing both inventories 
showed a median time of 24.5 minutes required 
for Form FE, while Form OE, physically about 
45 per cent the length of the other, required 
a median time of 12.6 minutes to complete. 

About 430 men and women students in 
education and 98 electrical engineering stu- 
dents were tested. Four hundred and eighteen 
completed pairs of blanks were secured from 
educators, and 94 pairs from electrical engi- 
neers. Not all the education students were 
included in criterion groups. The extra blanks 
were used in a reliability study of the interest 
scales which were developed. 


Treatment of the Data 


Composition of Criterion Groups. Three
groups in education were defined: educa- 
tion students-in-general, administrators and 
teachers. All members were male, between 
21 and 55 years of age. Education students- 
in-general included 50 per cent preparing for 
careers in administration, 30 per cent for 
teaching and 20 per cent were guidance 
students, representative proportions of male 
students at Stanford. The term “administra- 
tors” was chosen to cover students prepar- 
ing for supervision and for administration. 
Members of this group were required to have 
three or more years of experience in education 
and to be in the specialty group at the time 
tested. Teachers were required to meet the 
same criteria. Both teacher and administrator 
groups contained only those individuals who 
expressed a desire to remain in their specialty 
group. 

The electrical engineering student criterion
group consisted of men ranging from 21 to 55 
years of age, all of whom were committed 
to careers in electrical engineering and approved 
for advanced training by their department 
head. 

Construction of Occupational Interest Scales. 
For each of the criterion groups, item analysis 




data were secured, and interest scales were 
prepared for both inventory forms.

The scale system used was an adaptation of 
the method employed by Strong (7), in which 
interest data are weighted in terms of the 
differences between proportions of responses 
for two criterion groups. The datum from 
which differences are measured is termed by 
Strong the “point of reference” and the amount 
of differentiation between any two groups is 
in part a function of the point of reference 
chosen. 

For educational comparisons, the first point 
of reference used was interests of the education 
students-in-general group (N=150). Admin- 
istrators’ and teachers’ interests were each 
differentiated from these (scales were named 
“ADMINISTRATOR” and “TEACHER”). 
Because the latter groups were small (administrators,
N=56; teachers, N=41) a comparison
was made directly between the two groups,
using administrator interests as a point of
reference (this scale was labeled the
“ADMINISTRATOR-TEACHER” scale).

The point of reference for the comparison 
of educator interests with those of electrical 
engineers was the education student-in-general 
group. The engineer group was medium 
sized (N=94).

Strong's weighting table requires criterion 
groups of at least 100, so his system was not 
used directly. Strong has shown (7) that 
other methods yield about the same results as 
his own. One of these is a scheme developed 
by Guilford for securing item weights (4)
which is usable both for forced-choice and
L-I-D items.

The Guilford method weights each item from 
zero to plus or minus four, in accordance with 
a formula which takes into account both the 
magnitude of differences and the amount of 
confidence one has that they represent true 
differences. 
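
The general idea of weighting a response by the difference between two groups' response proportions, while discounting differences that may be due to chance, can be illustrated with a minimal sketch. This is not Guilford's formula, which is not reproduced in this article; the pooled two-proportion critical ratio, the cut-off `z_min` and the function name are all assumptions made for illustration, and only unit weights are produced.

```python
import math


def unit_weight(p_criterion, n_criterion, p_reference, n_reference, z_min=1.96):
    """Return +1, -1 or 0 for one response position.

    Stand-in rule (an assumption, not Guilford's method): weight by the sign
    of the difference in proportions when its critical ratio reaches z_min.
    """
    diff = p_criterion - p_reference
    pooled = (p_criterion * n_criterion + p_reference * n_reference) / (
        n_criterion + n_reference
    )
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_criterion + 1 / n_reference))
    if se == 0 or abs(diff) / se < z_min:
        return 0
    return 1 if diff > 0 else -1


# Hypothetical proportions choosing a response: 56 administrators against
# 150 education students-in-general (the point of reference).
print(unit_weight(0.70, 56, 0.45, 150))  # large difference -> 1
print(unit_weight(0.50, 56, 0.48, 150))  # negligible difference -> 0
print(unit_weight(0.20, 56, 0.45, 150))  # reversed difference -> -1
```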

By means of the Guilford system, scoring
keys were produced for the four comparisons
made for each form of the OAPB. Weights
of more than unity were used only in the
contrast of the interests of educators and
electrical engineers, where some weights of
two and three were employed.


The answer blanks of the criterion groups
were scored for each scale applicable to each
group, and blanks for 171 men and women
not in criterion groups were scored for all
scales to provide reliability information.


Results 


From the scores of the criterion group 
members, means, sigmas and standard errors 
of the means were computed. Differences 
between mean scores were evaluated for 
significance. Table 1, below, presents the 


Table 1

Differences Between Mean Scores of Professional
Groups on Four Interest Scales of Form FE
and Form OE, OAPB

Groups Contrasted                            Scale Name   D_M    σ_D    C.R.*

Form FE
Educators-in-General, Electrical Engineers   Ed-Eng       40.0   2.3
Administrators, Teachers                     Adm
Administrators, Teachers                     Tea
Administrators, Teachers                     Ad-Tea

Form OE
Educators-in-General, Electrical Engineers   Ed-Eng
Administrators, Teachers                     Adm
Administrators, Teachers                     Tea
Administrators, Teachers                     Ad-Tea       11.0   1.4    7.8

* All critical ratios are significant at or beyond the
.001 level of confidence.



mean differences, which can be seen to be
highly significant for each comparison.

To evaluate the differences in the discrimination
produced by the two different item forms,
a statistic which would take into account
both central tendency and spread was employed.
The measure was devised by defining
a measure of area common to two distributions, the
proportion of overlapping. This was taken
as the proportion of scores of one group
falling in the region between the tail of the
distribution and an ordinate raised at half the
sigma distance between the means of the two


distributions. This statistic involves the 
assumption that the two distributions of 
scores are normal, and that the scores are 
obtained with the same measuring device. 
Proportions of overlapping were calculated 
for each discrimination made with each form 
of the OAPB as follows. The difference 
between each pair of raw mean scores (for the 
same discrimination) was divided by twice 
the average standard deviation of the two 
distributions, thus locating an ordinate half- 
way between the means. The standard score 
value for this cutting point ordinate was 
converted to a raw score value for the distribution
with the larger N. In the cases where
this distribution had the lower raw mean value,
all scores between the highest and the cutting 
point were tallied. In the cases where the 
distribution chosen for computation possessed 
the higher mean value, the scores between the 
lowest score and the cutting point were tallied. 


Table 2

Differences Between Proportions of Overlapping for
Two Forms of the OAPB, on Four
Interest Comparisons

                                    Proportion of Overlap
                                                          (1)       (2)    (3)
Groups Contrasted                   Form FE    Form OE    D Ov'lp   σ_D    C.R.

Educators-in-General,
  Electrical Engineers              .15        .14        .01       .03    .33
Administrators, Teachers
  (Administrator Scale)             .23        .20        .03       .07    .42
Administrators, Teachers
  (Teacher Scale)                   .34        .34        0                -
Administrators, Teachers
  (Administrator-Teacher Scale)     .20        .20        0         .06    -

(1) Positive differences are in favor of Form OE,
that is, Form OE provides the smaller proportion of
overlapping.

(2) Standard errors of the differences computed by
using McNemar’s formula 28a (6) which takes into
account the correlational factor due to use of the same
subjects for both test forms.

(3) None of the differences is significant.



Table 3

Product-Moment Reliabilities for Four Occupational
Interest Scales of Form FE and
Form OE, OAPB*

Scale        r      r (corrected)**

Form FE
Ed-Eng      .75    .86
Adm         .42    .59
Tea         .10    .19
Ad-Tea      .51    .68

Form OE
Ed-Eng      .75    .86
Adm         .46    .60
Tea         .32    .48
Ad-Tea      .50    .67

* Calculated from scores of 171 education students
not in criterion groups.

** Corrected by Spearman-Brown prophecy formula
for test length.
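
The Spearman-Brown correction named in the second note has the general form r_k = k r / (1 + (k - 1) r). A minimal sketch follows; the lengthening factor k = 2 is an assumption, chosen because it reproduces the corrected values printed for several of the Form FE scales.

```python
def spearman_brown(r, k=2.0):
    """Predicted reliability of a test lengthened by a factor k.

    k = 2 (doubling the length) is assumed here; the article does not state it.
    """
    return k * r / (1 + (k - 1) * r)


# Uncorrected reliabilities for the Form FE Ed-Eng, Adm and Ad-Tea scales.
for r in (0.75, 0.42, 0.51):
    print(r, round(spearman_brown(r), 2))
```

Under that assumption the three values come out at .86, .59 and .68, matching the corrected column for those scales.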


The tallies were each converted into proportions 
of the chosen distribution, the proportions of 
overlapping. Actually, if the two distributions 
were normal, with equal sigmas, the true 
proportion of overlapping scores would be 
equal to twice the proportion of overlapping. 
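
The tallying procedure can be sketched in a few lines. Several details are assumptions made for illustration: population standard deviations are used for the sigmas, the cutting point is converted to a raw score with the chosen distribution's own mean and sigma, and scores falling exactly on the cutting point are counted. The normal-curve function gives the value expected when both distributions are normal with equal sigmas.

```python
import math
from statistics import mean, pstdev


def proportion_of_overlap(chosen, other):
    """Proportion of the chosen (larger-N) group's scores beyond the cutting point."""
    m_c, m_o = mean(chosen), mean(other)
    s_c, s_o = pstdev(chosen), pstdev(other)
    # Mean difference divided by twice the average sigma: the standard-score
    # distance from either mean to an ordinate halfway between the means.
    z_cut = abs(m_c - m_o) / (2 * ((s_c + s_o) / 2))
    if m_c < m_o:
        cut = m_c + z_cut * s_c
        tally = sum(1 for x in chosen if x >= cut)  # cutting point up to the highest score
    else:
        cut = m_c - z_cut * s_c
        tally = sum(1 for x in chosen if x <= cut)  # lowest score up to the cutting point
    return tally / len(chosen)


def normal_overlap(d):
    """Area of a normal curve beyond the midpoint when the means differ by d sigmas."""
    return 0.5 * (1 + math.erf((-d / 2) / math.sqrt(2)))


# Hypothetical scores for two small groups.
low_group = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
high_group = [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
print(proportion_of_overlap(low_group, high_group))  # 0.3
print(round(normal_overlap(2.0), 4))                 # 0.1587
```

Doubling the second figure gives the total proportion of overlapping scores under the normal, equal-sigma case.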
Differences between the proportions of 
overlapping obtained for the two different 
forms of the OAPB were computed, and 
standard errors of the differences obtained 
(McNemar’s formula 28a (6) was used).
These data are presented in Table 2, below. 
The reliabilities for the scales developed 
for both forms of the OAPB were obtained. 
These are presented in Table 3, and it may 
be noted that they are quite comparable from 
form to form, with one exception. The value 
for the TEACHER scale for Form FE is 
considerably lower than that for the L-I-D 
form, Form OE. This may have been due to 
the small size of the criterion group used for 
securing the scale weights (N=41) and to a 
scale with comparatively few weights (33). 


Discussion 


The results of this investigation were
clear-cut. For each comparison, from the
discrimination of the interests of electrical
engineering students from those of education
students-in-general to the separation of the
interests of teachers from those of education
administrators, the two inventories used
performed in an identical manner.

The interpretation of the results, however, is
dependent upon a number of contingent
factors. There are two classes of these, the
first being those limitations imposed by the
experimental design, and the second being
differences inherent in the item forms used.

The experiment was restricted to professional 
people, who were presumably not motivated 
to mislead the experimenter or to fake their 
scores. The interest inventories were con- 
structed to be understandable to the subjects, 
so that they should not have had any tendency 
to respond in a manner unrelated to what the 
inventories were intended to measure. It is 
not known what would have occurred had the 
blanks been ambiguous, or too difficult for 
the respondents, or had the situation been 
one to induce faking. In those cases the 
discriminations obtained with the two item 
forms might have been quite different. 

Another limitation in the experimental 
design was the arbitrary method of selecting 
the statements to be linked in the forced-choice 
test items, and the use of pairs as the units 
of comparison. Forced-choice items may be 
constructed with more than two alternatives, 
and Kuder (5) states that his triadic items are 
as reliable as paired-comparison items. 

Those differences due to the item forms can 
be evaluated quantitatively, and must be 
considered in interpreting the research results. 
It has been already indicated in the section 
on interest item arrangement that each L-I-D 
type statement could provide a maximum of 
three response positions to be weighted when 
two groups are compared. Forced-choice 
items using paired-comparisons can provide at 


most only one weight for each two statements.
The items used in the forced-choice Form FE
are twice the length of the L-I-D items in
Form OE of the OAPB. If all the items
on a forced-choice form were weighted on a
given interest scale, the form would require
twice the administration time that an L-I-D
form with the same number of items would
need. If the L-I-D form were weighted in all
possible positions, the A-B form would provide
only one-third the number of weights the
open-ended form would yield. Thus, an L-I-D
form could conceivably be one sixth the length
of a paired-comparison test form, and yield the
same number of weights (no consideration is
given here to the relative sizes of the weights;
in this study the L-I-D form provided a greater
range of weights than the forced-choice form).
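
The one-sixth figure follows from the counts just given, taking the statement as the unit of length. A minimal sketch of the arithmetic, in which the constant and function names are illustrative only:

```python
from fractions import Fraction

LID_WEIGHTS_PER_STATEMENT = 3  # like, indifferent, dislike positions
PAIR_WEIGHTS_PER_ITEM = 1      # one weight for each A-B item
STATEMENTS_PER_PAIR = 2        # each A-B item is two statements long


def lid_length(n_weights):
    """Statements needed on an L-I-D form to carry n_weights weights."""
    return Fraction(n_weights, LID_WEIGHTS_PER_STATEMENT)


def paired_length(n_weights):
    """Statements needed on a paired-comparison form to carry n_weights weights."""
    return Fraction(n_weights * STATEMENTS_PER_PAIR, PAIR_WEIGHTS_PER_ITEM)


ratio = lid_length(60) / paired_length(60)
print(ratio)  # 1/6
```

The ratio does not depend on the number of weights chosen; 60 is used only as an example.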

Since each statement represented in Form 
OE appeared three times in Form FE, Form
OE was physically about 45 per cent the
length of the other. Form OE also took only 
half the time to administer, yet produced the 
same total discrimination in terms of over- 


lapping of interests and scales of the same 
reliability. 


Summary and Conclusions 


An important problem in interest measure- 
ment concerns the relative effectiveness of 
different item response arrangements in dis- 
criminating among the interests of professional 
groups. 

In this study an interest inventory was 
designed in two comparable forms, one using 
L-I-D items and the other using forced-choice 
paired-comparisons, to discriminate between 
professional groups, and the relative merits of 
the two forms were assessed. 

Based upon the resultant discrimination per
unit item length and unit time required for
administration of the two forms, it is concluded
that the L-I-D test item arrangement in
this study is clearly superior to forced-choice.
Cronbach’s criticism of this item type seems
not well-founded, in terms of its performance
in discriminating the interests of professional
groups. The hypothesis which was offered
about the superiority of the forced-choice item
form for discriminating between subgroups
within a single profession was not upheld.

The study was limited to professional
persons who were not motivated to fake and
who presumably understood the item contents.
Also, the items were selected in accordance
with a functional scheme which imposed its
limitations on the results. In addition, only
pairs of alternatives were used in making up
the forced-choice items. It is not known what
would have occurred had triadic items been
used in the forced-choice form. Further
investigation is necessary to secure information
on these points.


Received May 25, 1951.




References 


1. Cronbach, L. J. Response sets and test validity.
Educ. psychol. Measmt., 1946, 6, 475-493.

2. Cronbach, L. J. Further evidence on response sets
and test design. Educ. psychol. Measmt., 1950,
10, 3-31.

3. Fehrer, Elizabeth, and Strupp, H. The effect of
equating interest test items for prestige value.
J. appl. Psychol., 1949, 33, 222-230.

4. Guilford, J. P. A simple scoring weight for test
items and its reliability. Psychometrika, 1941,
9, 67-81.

5. Kuder, G. F. Examiner manual for the Kuder Preference
Record—Vocational. Chicago: Science Research
Associates, 1949.

6. McNemar, Q. Psychological statistics. New York:
Wiley, 1949.

7. Strong, E. K., Jr. Vocational interests of men and
women. Stanford: Stanford University Press,
1943.

8. U. S. Employment Service. Dictionary of Occupational
Titles, Part I. Washington, D. C.: U. S.
Government Printing Office, 1939.

9. U. S. Office of Education. Biennial Review of Education.
Washington, D. C.: U. S. Government
Printing Office, 1946.


Communication, Supervision, and Morale 


C. G. Browne 
Wayne University 
and 
Betty J. Neitzel 
National Bank of Detroit 


This study was concerned with the estima- 
tion and communication of responsibility, 
authority, and delegation of authority by 
three supervisory levels of female employees 
in a utilities company. Comparisons will be 
made between the communication of the 
three factors and the attitudes of the super- 
visory employees toward company personnel 
policies. 

While in many cases management may
believe that it has established specific responsibilities
and authorities for given positions,
and that they are co-equal, it is important to
determine whether or not all levels of the
organization have communicated the thinking
of management in an understandable and
acceptable manner. Communication is a process
that takes place throughout the entire
organization between all individuals and
departments, in a flow both inwardly and
outwardly through all echelons.¹


Procedure

The subjects for this study were a group of
female employees at three supervisory levels
selected from eight offices of a Michigan
utilities company. The three supervisory
levels will be designated A, B, and C for
purposes of this report. Level A was the
inner level of the three and functioned in a
supervisory capacity to level B; level B
supervised level C; and level C supervised a
non-supervisory level not included in the
study. An office from two districts of each
of the four divisions of the company was
included. Districts 1 and 2 represented
Division I; Districts 3 and 4, Division II;
Districts 5 and 6, Division III; and Districts
7 and 8, Division IV.

¹ For an explanation of inner and outer as compared
with upper and lower management levels, see Browne.




The R, A, and D Scales² developed by
Stogdill and Shartle (8) were used to obtain
estimates of responsibility, authority, and
delegation of authority. The method used in
constructing the scales has been described by
Browne (2). To measure employee attitudes,
the morale scale devised by Harris (5) was 
used. This scale consists of 36 items, each 
having a discrimination value of 1.0 or higher. 
Five items which were not applicable to the 
utilities company were eliminated, leaving 31 
items which were used in this study and which 
provided a maximum score of 45.60 (the sum 
of the discrimination values of the 31 items). 
A total of 117 sets of forms were mailed to 
the divisions. Of these, 100 sets or 86 per cent 
were completed and returned directly to the 
authors. The completed forms included 8 
level A supervisors; 26 level B supervisors; 
and 66 level C supervisors. 


R, A, and D Scores 


The R, A, and D scores represent the 
person’s estimates of her responsibility, author- 
ity, and delegation of authority. The mean 
R, A, and D scores for each of the three 
supervisory levels are given in Table 1. 
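
The total-group means in Table 1 are the N-weighted averages of the three level means, which can be verified with a short sketch. The figures are copied from Table 1; the dictionary layout and function name are illustrative.

```python
# (N, mean R, mean A, mean D) for each supervisory level, from Table 1.
levels = {
    "A": (8, 3.61, 3.85, 4.64),
    "B": (26, 3.82, 4.52, 5.05),
    "C": (66, 3.87, 4.81, 5.54),
}


def weighted_means(levels):
    """Return the N-weighted mean R, A and D scores over all levels."""
    total_n = sum(n for n, *_ in levels.values())
    sums = [0.0, 0.0, 0.0]
    for n, *scores in levels.values():
        for i, score in enumerate(scores):
            sums[i] += n * score
    return total_n, [s / total_n for s in sums]


total_n, (mean_r, mean_a, mean_d) = weighted_means(levels)
print(total_n, mean_r, mean_a, mean_d)
```

The results agree with the printed total-group values of 3.83, 4.66 and 5.34 to within the rounding of the level means.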

Since the lower scores indicate a higher degree
of the factor measured, the mean scores of
3.61, 3.85, and 4.64 for the level A supervisors
represent the highest estimates for R, A, and
D, respectively. The mean scores of the level
B supervisors represent the next highest,
while the mean scores for level C supervisors
represent the lowest. The trend of the mean
scores indicates that the subjects estimated
the degree of their responsibility, authority,
and delegation of authority in relation to their
position in the company. That is, the closer
the supervisory level of a group was to the
focal point (3) of the organization, the higher
the estimates of each of the factors were.
This trend also was supported when the data
were studied by divisions, districts, and
individuals.

Table 1

Mean R, A, and D Scores*

Supervisory            Mean      Mean      Mean
Level           N      R Score   A Score   D Score
Level A         8      3.61      3.85      4.64
Level B         26     3.82      4.52      5.05
Level C         66     3.87      4.81      5.54
Total Group     100    3.83      4.66      5.34

* The range of possible scores on each scale is 1.0 to
8.7. It is important to note that the lower quantitative
scores indicate a higher degree of the factor measured,
while the higher quantitative scores indicate a lower
degree.

² Persons interested in information regarding the R,
A, and D Scales may contact Dr. Ralph M. Stogdill,
Associate Director, Personnel Research Board,
Ohio State University, Columbus 10, Ohio.

³ No quantitative interpretation of R, A, and D
scores will be used in the following discussion.
Instead, a qualitative interpretation will be used, so
that a discussion of a high R score, for example, will
represent an estimate of a high degree of R and a
discussion of a low R score will represent an estimate of a
low degree.

For the total group and for each supervisory
level, the figures in Table 1 also indicate that
responsibility was estimated to be the greatest
of the three factors, followed by authority and
delegation of authority, as evidenced in the
total group mean scores of 3.83, 4.66, and 5.34,
respectively. With the exception of one
district, this was consistently true when the
data were analyzed by divisions and districts.
The range of the individual scores was from
2.72 to 6.78 for R; 2.82 to 6.98 for A; and 2.90 to
7.55 for D. The mode for the R scores was 4.0;
for the A scores, 4.6; and for the D scores, 5.5.
Here again, the same relationship is observed.
For the individuals, 10 of the 86 cases estimated
authority to be greater than responsibility,
but the remaining 76 followed the trend of the
total data in estimating responsibility to be
greater than authority.

The data, then, demonstrate that these
supervisors did not estimate their responsibility
and authority to be equal, as might ideally be
expected. The product moment inter-correlations
between the three factors were: R and
A = .24; R and D = -.03; A and D = .22.
These coefficients indicate some tendency for 
persons with high responsibility estimates to 
estimate authority high also, and for those with 
high authority estimates to estimate greater 
delegation of authority. However, the rela- 
tionships were not as high as reported in two 
previous studies. In the Ohio State Leader- 
ship Studies, unpublished correlations for a 
group of Naval officers were found to be .56 
for R and A; .16 for R and D; and .86 for A 
and D. Browne (2) in his study of business 
executives reported correlations of .56 for 
R and A; .29 for R and D; and .54 for A and D.
It will be noted, however, that the correlations 
in the three studies indicate the same general 
trend since the correlations between R and A 
and between A and D were larger throughout 
than the correlation between R and D. The 
variation in the size of the coefficients may be 
regarded as a function of the variation in the 
groups and the situations in which they were 
operating. 

If R and A were judged to be equal or if 
each person estimated them in the same 
proportionate relationship, the correlation 
between them would be 1.00. The extent to 
which the relationship deviates from this 
perfect, ideal relationship may be dependent 
upon two general variables: (1) the effective- 
ness of communication between supervisory 
levels; or (2) the clearness and specificity with 
which management has defined responsibility 
and authority for each supervisory level in the 
organizational set-up. Considering the com- 
parative figures given above, it would appear 
that these variables as represented by R and A
Scores were more satisfactorily understood and 
communicated in the military situation and 
in inner management levels than they were in 
the present situation which studied outer 
management on the first, second, and third 
levels of supervision. 

A correlation coefficient of unity between A 
and D would indicate that all persons estimated 
their delegation of authority equally in re- 
lationship to their estimates of authority. 
Obviously this perfect relationship need not 
be expected. In fact, it might indicate an 
undesirable condition within the organization. 
However, the extent to which individuals 
believe they are delegating authority may be 


88 C. G. Browne and Belty J. Neitzel 


i m the size of the correlation. In 
ees further research would be 
needed to ee what the most desirable 

i i ould be. 

ee to believe that the individ- 
ual’s estimate of responsibility Should be 
related to his delegation of authority. Theo- 
retically at least, while authority can be 
delegated, responsibility cannot, since an 
individual is always responsible to inner 
management levels for the responsibilities 
which have been assigned to him. The 
correlation, then, between R and D has little 
working meaning, although the lack of any 
necessary relationship between these two 
variables is supported in all three of the studies 
reported since the R and D correlation in 
each case is the lowest of the three.


R, A, and D Disparity Scores 


In order to study the effectiveness of the 
communication of responsibility, authority, 
and delegation of authority, some measure of 
communication of these factors between the 
three supervisory levels was needed. For this 
purpose, disparity scores were used which 
represented the differences between the in- 
dividual’s estimations of R, A, and D for 
herself and the estimates of her supervisor or
assistants, as appropriate, of the three factors
for the individual.


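The supervisors’ estimates of the responsibility
and authority of their assistants are designated
“r” and “a” scores, respectively. The assistants’
estimates of the delegation of authority by their
immediate seniors are designated “d” scores.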

As an example, the R disparity score
for a level B supervisor is the difference
between the R score of the level B supervisor
and the “r” score of her level A supervisor.
Thus, the R disparity score represents the
difference between the level B supervisor’s
thinking regarding her responsibility and the
thinking of her supervisor. In this way,
the R disparity score serves as a measure of
the communication of responsibility between 
adjoining levels of supervision.* On the same 
basis, the A disparity score for a level B 
supervisor is the difference between her A 
score and the “a” score of her level A super- 
visor. In this study, the disparity score was 
used without consideration of the algebraic 
sign. However, interest in another study
may be in the direction of the difference, and
in this case the algebraic sign may be used.

Whereas R and A disparity scores can have 
only one value since they depend on the 
estimations of only two individuals, the D 
disparity score of an individual may have as 
many values as she has people under her 
supervision. For the purposes of this study, 
a composite disparity score was used, calculated 
in the following manner. In the case of a 
level B supervisor, the difference between 
her D score and the “d” score of each of her 
level C assistants was determined. The mean 
of these differences is the D disparity score 
for the level B supervisor. 
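
The disparity-score arithmetic described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and the sample scores are hypothetical, and the unsigned (absolute) difference is taken before averaging for the D composite, which the text implies but does not state outright.

```python
from statistics import mean


def disparity(own_score, other_score):
    """Unsigned difference between a person's own estimate and the
    estimate made for her by an adjoining-level colleague."""
    return abs(own_score - other_score)


def d_disparity(own_d_score, assistant_d_scores):
    """Composite D disparity: mean of the unsigned differences between a
    supervisor's D score and the "d" score given by each assistant.
    Assumption: differences are made unsigned before averaging."""
    return mean(abs(own_d_score - d) for d in assistant_d_scores)


# Hypothetical scores, for illustration only.
r_disp = disparity(3.2, 2.5)               # level B "R" vs. level A "r"
d_disp = d_disparity(4.0, [3.5, 4.5, 2.0])  # level B "D" vs. three level C "d"
```

A signed variant, as the text notes, would simply drop the absolute value when the direction of disagreement is of interest.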

The mean R disparity, A disparity, and D
disparity scores for the total group were .81,
.77, and .36, respectively, while the medians
were .72, .52, and .61. The range was 0.00
to 2.52 for R disparity; 0.00 to 2.83 for A
disparity; and .22 to 2.73 for D disparity.
The medians give a more accurate picture of
the results since the distribution had some
extreme scores and did not yield a normal
distribution.

Although the differences in median disparity
scores were not great, they indicate a tendency
for R disparity scores to be greatest, followed
by D disparity and then A disparity, but the
A and D disparities are reversed in order when
the means are considered. In no case did the
three R, A, and D scores of any person agree
with the three “r,” “a,” and “d” scores
obtained from her supervisor and assistants.
Since disparity scores are a means of stating
quantitatively the extent of disagreement
between a person’s estimate of the factor
measured and the estimate of her supervisor
or assistant of the same factor for the same
individual, they constitute a measure of
communication between the individuals. If
communication between supervisory levels is
complete, the responsibility and authority an
individual believes he has should agree with
his supervisor’s estimates of his responsibility
and authority, and the individual’s estimate
of authority delegated to his assistants should
agree with the assistants’ estimate of what
the senior has delegated. Differences in these
agreements are revealed by disparity scores,
the size of the score being a function of the
difference in thinking between supervisory
levels.

* It should be noted that disparity scores as described
here can be used only between adjoining levels of
supervision.

Correlation coefficients were obtained between 
disparity scores and Harris morale scores, and 
between the deviation of the individual R, A, 
and D scores from the mean R, A, and D 
scores of individuals in the same job. The R 
mean deviation score will be used to illustrate 
the methods of obtaining the deviation scores. 
The mean R scores of level A, level B, and 
level C supervisors were determined. The 
R mean deviation score of each level A 
supervisor is the difference between her 
individual R score and the mean R score of 
all level A supervisors. The same procedure 
was followed for the other supervisory levels 
for R mean deviation and for A and D mean 
deviation scores for each supervisory level. 
Thus, the mean deviation scores are measures 
of the extent to which the individual’s esti- 
mations of R, A, and D in her own specific 
position are at variance with the mean 
estimation of R, A, and D of all individuals 
included in the study doing her particular job. 
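
As a rough illustration of the mean deviation score just described, the sketch below groups scores by supervisory level and measures each person's distance from her level mean. The record layout, names, and values are hypothetical, and taking the unsigned difference is an assumption made to parallel the unsigned disparity scores.

```python
from collections import defaultdict
from statistics import mean


def mean_deviation_scores(records):
    """records: list of (person, level, score) tuples for one variable
    (R, A, or D). Returns {person: |score - mean score of her level|}.
    Assumption: the deviation is taken without regard to sign."""
    by_level = defaultdict(list)
    for _, level, score in records:
        by_level[level].append(score)
    level_means = {level: mean(scores) for level, scores in by_level.items()}
    return {
        person: abs(score - level_means[level])
        for person, level, score in records
    }


# Hypothetical R scores, for illustration only.
sample = [("a1", "A", 5.0), ("a2", "A", 7.0), ("b1", "B", 3.0)]
deviations = mean_deviation_scores(sample)
```

The resulting per-person deviations are what would then be correlated with the corresponding disparity scores.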
In Table 2, the correlations between dis- 
parity scores, morale scores and the R, A, and 
D mean deviation scores are given. The 
coefficient of .56 between R mean deviation and 
R disparity and of .63 between D mean 
deviation and D disparity represent substantial 
relationships between these two variables. 
Although the correlation of .31 between A 
mean deviation and A disparity is smaller, it 
indicates the same trend. Thus, in all of 
these correlations, the indication is that those 
individuals who deviated most in their estima- 
tions of the three factors from the estimates 
of their total job group (mean deviation score) 
also were the individuals who were at greatest 
variance with the estimates of their supervisors
and assistants for the three factors
related to their position (disparity score).
For example, an individual estimate of responsibility
that was higher or lower than the
mean responsibility score of the echelon to
which the individual belonged was likely
also to be higher or lower than the estimate of
her responsibility by her supervisor.

Table 2

Product Moment Correlations of R, A, and D Mean
Deviations and Morale Scores with R, A,
and D Disparity Scores

                        R, A, and D
Disparity        N    Mean Deviations    Morale
R disparity     92         .56            -.54
A disparity     92         .31            -.10
D disparity     34         .63            -.06

The correlation of —.54 in Table 2 indicates 
that individuals with high morale scores 
tended to be in closer agreement with their 
supervisors regarding their level of respon- 
sibility since this would make for low disparity 
scores. If it is accepted that the disparity 
score is a measure of communication between 
supervisory levels, then the present evidence 
regarding morale would support the concept 
that communication is one of the influencing 
factors in the determination of morale, 
particularly as related to the responsibility 
variable. 


Morale Scores 


Each morale score represents the attitude 
of an individual toward company personnel 
policies. The maximum score of 45.60 was 
obtained by one level A supervisor and three 
level B supervisors. The lowest score for 
the group was 17.95 for a level C supervisor. 
In six of the eight districts, the level A super- 
visor had the highest morale score, and the 
mean of the level B supervisor scores was 
higher than the mean of the level C supervisor 
scores. In two districts, both in the same division
of the company, the level B supervisors had
the highest mean score followed by the level C
supervisors, and the level A supervisor score was
the lowest. Generally, however, the morale
score was positively related to the echelon level
of the supervisors, the inner level supervisors
having the highest scores. It may be noted
that R, A, and D scores also were positively
related to echelon level.

Table 3

Product Moment Correlations of Morale Scores
with R, A, and D Scores*

Morale Score      N       R       A       D
Level A**         8     -.36    -.16      ?
Level B          26     -.47    -.39     .08
Level C          66      .16     .07     .10
Total Group     100      .05     .08     .09

* The sign for these correlations has been changed so
that in interpreting the correlations a large score in one
variable is also indicative of a large score in or a greater
degree of the second variable.

** The correlations for level A were computed by the
rank-difference method. The coefficients obtained were
converted into their equivalent Pearson r coefficients.
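
The level A coefficients in Table 3 were obtained by the rank-difference method and then converted to Pearson r equivalents. The article does not give the formulas used, so the following is only a sketch of one conventional route, assuming untied ranks and the classical conversion r = 2 sin(πρ/6); the function names and sample ranks are hypothetical.

```python
from math import pi, sin


def rank_difference_rho(ranks_x, ranks_y):
    """Rank-difference correlation for untied ranks:
    rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(ranks_x)
    sum_d2 = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


def rho_to_pearson_r(rho):
    """Classical conversion of rho to an equivalent product-moment r.
    Assumption: this is the conversion intended; the source does not say."""
    return 2 * sin(pi * rho / 6)


# Hypothetical ranks for five supervisors, for illustration only.
rho = rank_difference_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
r_equiv = rho_to_pearson_r(rho)
```

With only eight level A supervisors, coefficients of this kind are subject to wide sampling error whichever conversion is used.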

Table 3 includes the correlations between 
morale scores and R, A, and D scores for the 
three supervisory levels and for the total 
group. For the total group there is little 
relationship between the variables as indicated 
by the correlations of .05, .08, and .09 for 
R, A, and D, respectively. However, for 
R and A correlated with morale scores for 
the inner levels A and B supervisors, the 
range of correlation coefficients was —.16 to 
—.47. There appears, then, to be a definite 
trend in the inner supervisory levels, partic- 
ularly for those who estimated responsibility 
and authority higher, to have lower morale 
scores. This was not the case, however, 
with the outer level C supervisors, there being
little relationship between the variables for
them as reflected in the coefficients of .16,
.07, and .10.


Summary 


This study was an investigation of the
communication of responsibility, authority,
and delegation of authority at three supervisory
levels of a utilities company and included
a study of employee morale in relation to the
three factors. The R, A, and D Scales
developed by Stogdill and Shartle and the
Harris morale scale were used as measuring
instruments. As one measure of communication,
a disparity score was used which represented
the differences between the individual’s
estimates of R, A, and D for herself and the
estimates of her supervisor in the case of
R and A and the estimates of her assistants
for D.

The results of the investigation included the 
following: 


1. Individuals estimated their responsibility,
authority, and delegation of authority in
relation to their position in the company, those 
nearer the focal point of the organization 
having higher scores on all three variables. 

2. Responsibility and authority were not 
estimated to be equal, but most subjects 
believed their responsibility exceeded their 
authority. 

3. Disparity scores (the differences between 
the individual’s estimates of R, A, and D for 
herself and the estimates of her supervisor or
assistants, as appropriate, of the three factors
for the individual) produced no cases of
agreement between individuals on varying 
levels of supervision, the amount of disparity 
being a measure of incomplete or unsatisfactory 
communication. 

4. There was a negative relationship between 
morale scores and disparity scores, this being 
particularly evidenced with R disparity scores 
which correlated —.54 with morale scores. 

5. Correlations of .56, .31, and .63 were
obtained between the deviation of individual
R, A, and D scores from the mean score of
each supervisory level group and disparity
scores for the three variables.

6. Morale scores were found to be positively
related to the echelon level of the supervisors,
the inner level supervisors generally having the
highest scores. In the inner supervisory levels
there was a trend, as indicated in correlation
coefficients ranging from -.16 to -.47, for
those who estimated responsibility and authority
higher to have lower morale scores.

Received May 14, 1951.


References 


1. Barnard, C. I. The functions of the executive.
Cambridge: Harvard University Press, 1947.

2. Browne, C. G. Study of executive leadership in
business. I. The R, A, and D Scales. J. appl.
Psychol., 1949, 33, 521-526.

3. Browne, C. G. The concentric organization chart.
J. appl. Psychol., 1950, 34, 375-377.

4. Guilford, J. P. Fundamental statistics in psychology
and education. New York: McGraw-Hill, 1942.

5. Harris, F. J. The quantification of an industrial
employee survey. J. appl. Psychol., 1949, 33,
103-111.

6. Jucius, M. Personnel management. Chicago:
Richard D. Irwin, Inc., 1947.

7. Stogdill, R. M. Leadership, membership and
organization. Psychol. Bull., 1950, 47, 1-14.

8. Stogdill, R. M., and Shartle, C. L. Methods for
determining patterns of leadership in relation to
organization structure and objectives. J. appl.
Psychol., 1948, 32, 286-291.


Opinions on Communism of Air Force Police Trainees 


Major Norman E. Green 


Air University, Human Resources Research Institute


In the summer of 1950 the security con- 
sciousness of the United States Air Force 
rose to a new high. Increased international 
tensions and the results of security vulnerabil- 
ity surveys at certain air bases showed that 
a vital need existed for greater protection 
against subversive and sabotage activity. 
This was particularly important at fighter 
interceptor bases where the USAF is charged 
with first line defense of the nation against 
hostile activity in the air. It was equally 
important at long-range bomber bases where 
our aircraft and crews must be instantly ready 
should a retaliatory air strike become necessary.
The lack of an airtight security plan might 
feasibly result in a crippling blow before any 
effective reaction could be made. Part of 
the answer to this need was seen as the prompt 
training of an increased force of Air Policemen. 
A school for this purpose was established at 
Tyndall Air Force Base, Florida, and in 
September 1950 its doors were opened for the 
young airmen students. 


The Problem Situation 


The course centered around anti-sabotage 
measures and the development of proficiency 
in weapons and unarmed combat. Through- 
out the period of schooling, instruction of a 
motivational and informational nature was 
also presented. This included such subjects
as career opportunities in the Air Police
System and discussions of communism as a 
threat to the American way of life and to 
the security of the USAF. The psychological 
preparation of the airman for his new duty 
was not neglected nor made secondary to the 
physical preparation. This, of course, was in 
line with policies on troop information in 
general and was commensurate with the 
now-common knowledge that the best informed
airmen are characterized by higher morale and 
efficiency. 

Three 45-minute periods were allotted for
instruction on communism. The rationale
behind this instruction was to present facts
about communist activity as it is operating
in the world today and not to dwell upon
discussions of political philosophy and what
might be or could be.

The first period served as an introduction
during which the instructor described the
threat of communist sabotage at vital Air
Force bases and showed a film depicting the
origin and growth of communism, its patterns
of aggression and its subversive methods.

The period for the second week was called
“Communism in the United States.” With
two instructors taking part, it was presented
in question and answer form with several of
the questions and comments coming from the
students themselves. The following are typical
of the questions discussed: What is communism?
Has any nation ever gone communist in a free
election? How do communists try to get
control? Under communist rule: could I
belong to a union; could I go to school; could
I change my job; could I travel around the
country as I please; could I teach what I
want with “academic freedom”? How many
communists are there in the United States?
Where are their headquarters in the United
States? What is the communist party line
in the United States? What does one have
to do to join the communist party? How do
communists get control of organizations in
which the majority are not communists?

During the third week the discussion method
was used in a similar manner for the subject
of “Communism and Religion.” The following
questions are typical of those considered:
If communism should come to the U. S. could
I belong to a church? What would the
communists do to the churches and synagogues?
What is the communist faith? Do
communists pretend to tolerate religion today?
How would my child learn his religion? Who
would own the churches? What is the
“Peoples Institute of Applied Religion”?
How are priests and ministers treated under
communist rule?


Throughout these three class periods empha- 
sis was put on the close tie-in between the 
success of the Air Police mission and the 
success of the whole Air Force mission. 


The Problem of Attitudes 


To obtain information on the depth and 
direction of the Air Police trainees’ attitudes 
toward communism was considered important 
for three reasons: (1) The results of such an 
inquiry would serve as an appraisal of certain 
learning outcomes; (2) these results would also 
provide data on the modifiability of attitudes 
in a school situation; and (3) some insight 
about the quality of the airmen’s psychological 
preparation for their responsibilities would 
be gained. Accordingly, the investigation of 
attitudes was undertaken. 


The Population 


The population used for the study included 
four classes of airmen Air Police students 
totaling 1,974 subjects. These airmen had 
been sent direct to Tyndall Air Force Base 
from the Indoctrination Wing at Lackland 
Air Force Base, Texas, where they had taken 
their basic recruit training during a stay of 
approximately four weeks. The new airmen, 
most of whom were high-school graduates, had 
voluntarily enlisted and had come from homes 
all over the country. Two classes totaling 
992 incoming students were used as the control 
group and two classes comprising 982 outgoing 
students served as the experimental group. 
Original selection procedures and qualification 
requirements (physical examinations and 
minimum AGCT score of 90) for Air Force 
service were the same for all subjects. All 
airmen were assigned to this training to fill 
the immediate need described above. The 
same instruction on communism and all other 
matters was given to all airmen in the popula- 
tion. In addition, as shown in Table 1, the 
control group and the experimental group can 
be considered alike in age and amount of
formal schooling. 


Procedure 


The ten statements were composed by the
writer to serve as the opinion yardstick.
They were made purposefully strong in tone
to provide opportunity for indication of
intensity of opinion. As such, the statements
do not reflect the tone of the instruction
presented. For each item spaces were provided
for expressing “no comment,” “strongly
disagree,” “disagree,” “agree,” and “strongly
agree.” Complete anonymity of respondents
was maintained throughout the study.

Table 1

Data on Age and Educational Achievement for Control
Group (N = 992) and Experimental
Group (N = 982)

                                      Control    Experimental
                                       Group        Group
Mean Age                               20.0         19.8
Mean Years of School                   11.8         11.6
Per Cent Completed 7th, 8th and
  9th Grades Only                       3.4          8.0
Per Cent Completed 10th and
  11th Grades Only                     18.1         19.3
Per Cent Completed High
  School Only                          65.5         61.9
Per Cent with Some College Work        13.0*        10.8**

* Includes 3 college graduates.
** Includes 4 college graduates.

The control group of 992 incoming students 
completed the survey schedule on the morning 
of the first day of classes before any instruction 
was given. The experimental group of 982 
outgoing students completed the form during 
the third week of the course after the third 
and final period of instruction on communism. 
The responses of the two groups were tabulated 
separately and the data were organized to 
show comparisons of the two groups on a 
before and after basis. Thus, each statement 
was analyzed by showing percentages of 
agreement and disagreement giving, by in- 
ference, a picture of the shifts and changes 
in direction and intensity of opinion. The 
“before” and “after” differences in feelings 
about communism were then tested for 
statistical significance. 


Results 


The results of the analysis are presented 
below for each statement individually. The 
figures are expressed in per cent and critical 
ratios are given to show the statistical signif- 
icance of differences for each response category. 
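
The article does not state how the critical ratios were computed. As a hedged sketch, the function below uses one common form for the difference between two independent percentages, with a pooled proportion in the standard error; the function name is hypothetical and the pooled form is an assumption, not the author's documented procedure.

```python
from math import sqrt


def critical_ratio(pct_control, n_control, pct_experimental, n_experimental):
    """Difference between two independent percentages divided by its
    standard error, using the pooled proportion (an assumed formula).
    Undefined when both percentages are 0 or both are 100."""
    p1 = pct_control / 100.0
    p2 = pct_experimental / 100.0
    pooled = (p1 * n_control + p2 * n_experimental) / (n_control + n_experimental)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_experimental))
    return (p2 - p1) / std_error


# Group sizes from the study; 44 and 66 per cent are the "strongly agree"
# figures reported for statement 4.
cr = critical_ratio(44, 992, 66, 982)
```

Under these assumptions the example evaluates to about 9.8. A ratio beyond roughly 2.6 in absolute value corresponds to the .01 level on a two-tailed normal test.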




1. Communism is a plan to rule the world.

                      No      Strongly                       Strongly
                    Comment   Disagree  Disagree    Agree     Agree
Control Group          3        2.3        3          ?         ?
Experimental Group     2         .8        0          ?         ?
Critical Ratio         ?       -2.8      -5.4       -9.5        ?


2. Under communism Americans would have to obey orders from the bosses or be put in jail or shot.

                      No      Strongly                       Strongly
                    Comment   Disagree  Disagree    Agree     Agree
Control Group          4        1.1        2         49        44
Experimental Group     3         .8        0         29        67
Critical Ratio       -2.1       -.7      -3.3       -9.5      10.

3. A communist is the next thing to a gutter rat.

                      No      Strongly                       Strongly
                    Comment   Disagree  Disagree    Agree     Agree
Control Group         19        2.4       12         30        36
Experimental Group    13        1.5        4         21        60
Critical Ratio       -4.4      -1.5      -6.5       -4.5      10.9

4. Communists would disobey the Almighty rather than disobey their leader.

                      No      Strongly                       Strongly
                    Comment   Disagree  Disagree    Agree     Agree
Control Group         11         1         3         41        44
Experimental Group     4         1         1         28        66
Critical Ratio       -5.7       --       -2.2       -6.0       9.8

5. Communists are eager to destroy all our airplanes.

                      No      Strongly                       Strongly
                    Comment   Disagree  Disagree    Agree     Agree
Control Group         23        1.4       10         38        28
Experimental Group     5         ?         ?         26        67
Critical Ratio         ?       -1.4      -7.8       -5.7      17.5

6. A communist would give up his wife before he’d give up the party.

                      No      Strongly                       Strongly
                    Comment   Disagree  Disagree    Agree     Agree
Control Group         22         .9        5          ?        28
Experimental Group    20         .6        4          ?        41
Critical Ratio         ?         ?         ?          ?        6.0




7. Communists want to have more children to help build up the Red army.

                      No      Strongly                       Strongly
                    Comment   Disagree  Disagree    Agree     Agree
Control Group          ?         .8        2         49        27
Experimental Group     ?        1.6        1         38        48
Critical Ratio         ?        1.5        ?        -4.9       9.8

8. Communists would like to blow up every church they could.

                      No      Strongly                       Strongly
                    Comment   Disagree  Disagree    Agree     Agree
Control Group          ?         ?         ?         35        30
Experimental Group     ?         ?         ?         34        52
Critical Ratio         ?         ?         ?        -0.4        ?

9. The communists have plans to take over all of the United States.

                      No      Strongly                       Strongly
                    Comment   Disagree  Disagree    Agree     Agree
Control Group         17        1.3        ?          ?        39
Experimental Group    12         .6        0          ?        54
Critical Ratio         ?       -1.6        ?        -3.7       6.9

10. A communist would wipe his feet on the American flag before he’d salute it.

                      No      Strongly                       Strongly
                    Comment   Disagree  Disagree    Agree     Agree
Control Group         14         .6        ?          ?         ?
Experimental Group    13        1.1        3         24        58
Critical Ratio         ?        1.3      -3.6       -6.7       6.7


Discussion of Results 


An examination and study of the data make
certain facts eminently clear and provide a
basis for further inferences.

On all ten statements about communism,
an overwhelming majority of this sample of
airmen were in agreement when they entered
the school. This agreement ranged from a
majority of 65 per cent on item 8 (Communists
would like to blow up every church they could)
to a majority of 93 per cent on item 2 (Under
communism Americans would have to obey
orders from the bosses or be put in jail or
shot), the average agreement being 77.5 per
cent.

Following the third period on communism,
the majority agreement had increased to a
range of from 75 per cent on item 6 (A communist
would give up his wife before he’d
give up the party) to a majority of 97 per cent
on item 1 (Communism is a plan to rule the
world), the average agreement now being
87.7 per cent. This change was reflected
principally but not entirely in a shift in
intensity of opinion from “agree” to “strongly
agree” and is statistically significant at the
.01 level of confidence.

Again considering all ten statements, the
number of responses in the “no comment”
category was significantly reduced, and those
who originally strongly disagreed or disagreed
were in a still smaller minority (2.8%) after
the school experience. The pattern of these
changes in direction and intensity of opinion
was consistently positive and vectored to the
categories of agree and strongly agree.

It is somewhat revealing if not surprising
to find such a consensus of anti-communist
feeling within this population of American
youth. However, this general attitude may
be at least partially accounted for by the
fact that the group had already been motivated
in this direction to the point of voluntary
enlistment in the service. Thus, the strength
of the attitude is mirrored in the decision to
“join up.” Whatever may be the true
similarity between these opinions and those
of the nation as a whole, the suggestion is
inescapable that Americans may have much
stronger feelings on this entire issue than their
leaders imagine.

This study supports other investigations
showing that a planned program of information
can result in definite shifts of attitudes. If it
is desired that a population be better prepared
psychologically to meet an aggressor, a simple
plan for the communication of ideas will go
a long way toward doing the job.

From the standpoint of their preparation 
attitude-wise for their responsibility for security 
and protection within USAF installations, 
these airmen may be said to be well-equipped. 
This is true not only because the attitudes 
exist, but because they exist in strength and 
intensity. Attitudes are characterized by a 
behavior or action component. This might 
extend from a mere inclination to vote “yes”
to a most forceful or even sacrificial reaction. 
The anti-communist feelings of these air 
policemen, therefore, imply a definite readiness 
to respond appropriately to certain persons and 
situations. 


Received May%3, 1951. 




Studies in Job Evaluation. 9. Validity of a Check List for
Evaluating Office Jobs *


Minnie Caddell Miles 


Occupational Research Center, Purdue University 


For a number of years increasing emphasis
has been attached to the study of job evaluation.
World War II gave an impetus to this
movement. Almost overnight industrial
managers all over the country became more
keenly aware than ever before of the tremendous
need for adequate methods for evaluating
jobs. This need continues to exist today.
Particularly is this true in the area of office
jobs. In spite of the vast amount of material
published on the progress of job evaluation
programs, a relatively small amount pertains
to the office. As late as 1945, according to
Ells (5), few companies had classifications for
office employees other than for a few clerical
and stenographic jobs. In 1949 a survey by
the National Office Management Association
revealed that only 32 per cent of the companies
reported office job evaluation plans (13).

The difficulties of studying office jobs, 
because of the lack of standardization and the 
multiplicity of duties sometimes performed 
by a single job incumbent, are no doubt 
partially responsible for this fact. In addition, 
one is faced with intangibles which are difficult 
to measure; with numerous jobs which cannot 
be put on a measured production basis; plus 
the fact that few clerical jobs remain the 
same over a period of time (9). And, by no 
means a minor factor is the difficulty in 
determining dependable market rate com- 
parisons (2). However, these problems em- 
phasize all the more the need for careful 
study. 

A possible approach for overcoming many 
of these problems is that of a check-list. It is 
the purpose of this study to determine the 
validity of a Job Description Check-List for 
the evaluation of office jobs. The underlying 
assumption is that by using paired-comparison
job ratings as criteria, we can determine the
validity of the check-list device.

* This paper is based upon a thesis submitted to the
Graduate School of Purdue University in partial fulfillment
of the requirements for the degree of Doctor of
Philosophy, January, 1951 (10a). The research was
done under the direction of Dr. C. H. Lawshe.


Background of Problem 


Two different groups have devoted research
to the use of job elements in the study of
office jobs. The first such research was by
members of the Life Office Management
Association, and the second was under the
sponsorship of the Occupational Research
Center, Purdue University.

Job Elements as a Basis for Job
Evaluation. The Life Office Management
Association, having a fertile field for research
among the thousands of office employees in
the life insurance business, has for a number
of years carried on a continuing project in the
evaluation of office jobs. As a result, the
Clerical Salary Study Committee of this
organization developed, or was responsible for
the development of, what is known as the
Job Element Evaluation Plan. Under this
plan 149 clerical operations were distinguished
and their relative values determined. It was
felt that with such a plan the writing of job
descriptions would be more accurate, job
comparisons more valid, and the comparison of
wages and salaries among different companies
easier. A detailed discussion of this plan
is to be found in Clerical Salary Administration.

Development of an Operational
Check-List. Culbertson set out to determine the
feasibility of an operational check-list for the
description of clerical jobs (3). From his own
experience and from a survey of the literature he
identified the basic operations which comprise
clerical activity. After trial revision of
the items and experimental tryout of the
check-list, it was concluded that it was
adequate for describing clerical jobs.

Revision and Application of the Job Description
Check-List. As a further development of
the check-list approach, Dudek (4) attempted
to devise a job evaluation plan which would 
be relatively simple and easy to grasp, as well 
as easily administered. He hypothesized that 
these aims could be achieved by: (1) identifying 
all tasks or operations involved in a class of 
jobs; (2) evaluating these operations on a 
relative scale; and (3) evaluating each job in 
terms of the relative amount of time spent on 
each task. 
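
Step (3) can be illustrated with a small calculation. The sketch below assumes, purely for illustration, that a job's value is the average of its task scale values weighted by the share of time spent on each task; the function name, task names, scale values, and time shares are all hypothetical and are not taken from Dudek's check-list.

```python
def job_value(task_scale_values, time_shares):
    """Time-weighted average of task scale values for one job.
    task_scale_values: {task: scale value on the relative scale}
    time_shares: {task: relative amount of time spent on the task}
    Assumption: shares are normalized so they need not sum to 1."""
    total_time = sum(time_shares.values())
    weighted_sum = sum(
        task_scale_values[task] * share for task, share in time_shares.items()
    )
    return weighted_sum / total_time


# Hypothetical scale values and time shares, for illustration only.
scale = {"typing": 2.0, "filing": 1.0, "bookkeeping": 4.0}
shares = {"typing": 0.5, "filing": 0.25, "bookkeeping": 0.25}
value = job_value(scale, shares)
```

Normalizing by total time lets the shares be entered as hours, percentages, or fractions without changing the result.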

Using Culbertson’s (3) original check-list 
as a basis, Dudek revised the items, and determined
a scale value for each. This check-list
was tried out with a group of some 150 office 
employees in a radio plant. It was concluded 
that the check-list adequately described the 
tasks of the office workers in the study. 
Further research was suggested, however, for 
demonstrating the adequacy of such an 
instrument for job evaluation purposes. 


Procedure and Results 


The Job Description Check-List of Office
Operations¹ used in the present study is a
slightly revised form of the one developed by
Culbertson and Dudek. 

Procurement of Data. Each of the three 
companies from whom basic information was 
obtained is engaged in a different type of 
operation from that of the others, as well as 
being located in a different section of the 
country. A foundry, located in the South, 
supplied part of the data. Another contributor 
was a manufacturer of office filing supplies 
located in the East, and a third was a member 
of the automotive industry located in the 
Midwest. Data for cross-validation purposes 
were obtained from two steel mills, one in the 
Midwest and the other in the East. 

For purposes of convenience, these participating
concerns will be referred to throughout
the study as: Company I, the foundry;
Company II, the filing concern; Company III,
the automotive plant; and Companies IV and
V, the steel mills.

¹ The detailed check-list has been deposited with the
American Documentation Institute. Order Document
3267 from American Documentation Institute, 1719 N
Street, N.W., Washington 6, D. C., remitting $1.00 for
microfilm (images 1 inch high on standard 35 mm
motion picture film) or $1.20 for photocopies (6 × 8
inches) readable without optical aid. Printed copies
may be obtained by writing to Dr. C. H. [name illegible],
Occupational Research Center, Purdue University,
Lafayette, Indiana.

Key Office Jobs from Cooperating Concerns. 
Each of the companies was asked to submit a 
list of 25 key office jobs according to suggested 
criteria. The key jobs were to be distributed 
throughout the entire range at present pay 
rates; were to “sample” the various areas of 
work being performed; should not be in 
dispute in regard to pay rates; and should be 
relatively well known by at least four or five 
people who were qualified to rate them. 

Companies II, III, IV, and V submitted 
lists of 25 key jobs, but Company I had only 
15 jobs which met the criteria. However, the 
final number of check-lists used were 14 for 
Company I, 25 for Company II, 20 for Com- 
panies III and IV, and 24 for Company V. 
Company II actually supplied 43 check-lists, 
which included one for each job incumbent in 
each key job; but, in instances where more 
than one check-list was prepared for a job, 
the scale values were averaged to arrive at the 
mean values used in the computations. The 
number of check-lists for some of the companies 
was reduced from the original number of key 
jobs because of the discontinuance of the job 
before the check-list phase of the study was 
completed. In other instances insufficient 
data were given to enable the inclusion of the 
check-list. 

Paired-Comparison Ratings of Key Jobs by 
Selected Judges. Upon receipt of the lists of 
key jobs, IBM cards for paired-comparison 
ratings were mailed to each company. These 
cards* were marked independently by five 
raters in Company I and by four raters in each 
of the other companies. Typewritten instruc- 
tions for the raters accompanied each set of 
cards. 

Check-Lists Completed for Key Jobs. As
soon as the paired-comparison rating cards
were received from a company, the check-lists
were mailed with accompanying instructions
for completion. Each job incumbent and
his immediate supervisor were requested to
check independently the duties performed on
the job. A third party, usually the coordinator
of the research within the company, compared
the two and identified any points of difference.
A conference was then held with the incumbent 


Studies in Job Evaluation 99 


and the supervisor in order to reach an agree- 
ment on these differences. 

After agreement had been reached, a con- 
ference was held with the supervisor alone to 
determine which of the operations were 
considered most important to the job. Com- 
panies I and III indicated the operations 
judged most important by marking them 
1, 2, 3, and on through 10, in the order of their 
importance. Companies II, IV, and V in- 
dicated the most important operations by 
marking the five operations judged most 
important “A,” the five second in importance 
“B,” and the third five “C.” 

Analysis of Data. The first step was to
determine the reliability of the judges’ paired-comparison
ratings of the key jobs, which
ratings were to be used as criteria for determining
the validity of the check-list.

Correlations of Judges’ Ratings Used as
Criteria. The average of the intercorrelations,
obtained by means of Fisher’s Z transformations (7),
ranged from .79 for Company II
to .93 for Company V.

By means of Kelley’s formula (10) the reliability
of the judges’ ratings was determined.
The relatively high intercorrelations, as well
as the resulting reliability figures, led to the
conclusion that all judges should be included
in the criterion measures with the exception
of one judge for Company IV.

For each job in a particular company, the
average of the paired-comparison ratings of
the judges was used as a criterion value.
The reliabilities of these pooled ratings
(criterion values) for the five companies,
estimated by means of the Spearman-Brown
formula (11) from the average intercorrelations
referred to above, were as follows: I, .95; II,
.94; III, .94; IV, .95; and V, .98.

With criterion values as reliable as these,
it was possible to make a comparison between
the scale values of the check-list operations
marked for a particular job and the value
assigned to this same job by the judges.
Scale values had been previously assigned to
the check-list operations by a group of
experienced managerial judges during Dudek’s
study. The problem in the present study was
to determine ways of combining check-list
values to obtain optimum agreement with the
criterion values.

Fig. 1. Correlations of the means of the highest
scale values with the criteria. (Vertical axis: coefficient
of correlation; horizontal axis: number of operations used.)


As indicated earlier, Companies IV and V 
were to constitute “hold-out” groups. Con- 
sequently, attempts to derive weighting 
schemes were confined to Companies I, II,
and III.

Correlations of Check-List Operations with the 
Criteria. It has been suggested that a 
relatively small number of the highest level 
operations performed by a job incumbent 
actually account for the over-all job level. If 
this is true, some scheme could be devised to 
isolate these critical operations and to give 
them a much greater weight than is given to 
the more or less routine types of activity that 
may be performed by nearly all office em- 
ployees. 

An attempt was made to identify these 
critical operations by selecting, from the ten 
operations judged most important by the 
supervisor of the job, the single operation 
having the highest scale value. Correlations 
computed for each of the three companies 
between these single scale values and the 
criterion ratings ranged from .51 to .74, with 
an average of .66. These values are shown 
graphically in Figure 1. Similarly, the mean 
scale value of the two operations with the 
highest scale values for each job was computed and
correlated with the ratings, and the results
plotted in Figure 1. This process was continued,
using the highest three, the highest four,
etc., until ten operations were used. To
complete the picture, the mean scale value of
all operations marked for each job was computed,
the correlations determined, and plotted
in the figure.
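The procedure just described (take the k highest scale values per job, average them, and correlate the averages with the criterion ratings) might be sketched as follows. The job names, scale values, and criterion values are invented for illustration, and each list is assumed to hold the scale values of the operations the supervisor judged most important.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mean_of_top_k(scale_values, k):
    """Mean of the k largest scale values marked for one job."""
    top = sorted(scale_values, reverse=True)[:k]
    return sum(top) / len(top)

# Hypothetical data: scale values of marked operations and pooled judge ratings.
jobs = {
    "file clerk": [2.1, 3.4, 1.8, 2.9, 2.2, 1.5],
    "typist": [3.0, 3.8, 2.5, 4.1, 2.0, 2.7],
    "bookkeeper": [5.2, 4.4, 3.9, 6.0, 3.1, 4.8],
    "office manager": [7.5, 6.8, 5.9, 8.1, 4.0, 6.2],
}
criterion = {"file clerk": 1.2, "typist": 2.0, "bookkeeper": 3.1, "office manager": 4.5}

for k in (1, 2, 3, 5):
    x = [mean_of_top_k(values, k) for values in jobs.values()]
    y = [criterion[name] for name in jobs]
    print(k, round(pearson_r(x, y), 3))
```

Plotting the printed correlation against k would give a curve of the kind shown in Figure 1.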


Fig. 2. Correlations of the means of the operations
judged most important with the criteria. (Vertical axis: coefficient
of correlation; horizontal axis: number of operations used.)


It will be noted in Figure 1 that there is a
systematic increase up through five operations 
and that the curve then descends. In other 
words, in so far as one can generalize from these 
data, it appears that the means of the five 
operations with the highest scale values 
correlate higher with the criterion judgments 
than do the means of fewer operations, or more 
operations. 


Another attack involved disregarding the 
magnitude of the scale values of the operations 
as such and considering the items in terms of 
their relative importance as estimated by the 
supervisors. For Companies I and III the
procedure was identical to that described above
except that the single operation used was the
one judged most important; when two operations
were used they were the two judged most
important, etc.


with these two 


the curve in addition 
g all operations. As 


; mean trend of these 


correlations is similar to the trend in Figure 1
in that there seems to be an optimum point
around four or five operations where the
correlation appears higher than it does when
fewer or more operations are considered.

The maximum average correlation in Figure
1 is .79, whereas the maximum average
correlation in Figure 2 is .84. It would be
difficult, if not impossible, to demonstrate a
statistically significant difference between these
two values, but in view of the fact that higher
correlations were obtained by using those
operations judged most important in contrast
to those operations with the highest values,
it seems justifiable to utilize the former as a
basis for a weighting system.

Having obtained correlations averaging .84 
between the mean scale value of the five 
operations judged most important and the 
criterion values of the jobs, the next question 
concerned the remaining operations. Would
the utilization of the scale values of the
remaining operations performed materially
change the correlations?

In an attempt to answer this question, 
the correlations of the five operations judged 
most important were considered as the A 
group; the next five in importance as group B; 
and all remaining operations were included in 
group C. These r’s obtained from correlating 
the means of groups A, B, and C with the 
criteria were used for computing a shrunken 
multiple R for each company, following the 
Wherry-Doolittle selection procedure (12). 
The shrunken multiple R’s with the inclusion
of groups A, B, and C range from .78 for Company
I to .86 for Company III. In each of
the three companies a smaller shrunken
multiple R resulted from the inclusion of all
three groups. Essentially no increase was realized
from the inclusion of any group other
than A. The use of groups A and B in Company
III changed the R only from .863 to .866,
whereas in Company II the use of groups A and
C changed the R from .795 to .788. Other combinations
of operations failed to yield higher
correlations.
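To make the idea of a shrunken multiple R concrete, the sketch below computes a two-predictor multiple R from zero-order correlations and then applies a commonly used shrinkage adjustment. The adjustment shown is an assumption chosen for illustration; the article does not print its formula, and the Wherry-Doolittle procedure it cites may differ in detail. All input values are hypothetical.

```python
import math

def multiple_r_two_predictors(r_y1, r_y2, r_12):
    """Multiple R of a criterion on two predictors, from the three zero-order correlations."""
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(r_squared)

def shrunken_r(multiple_r, n_cases, n_predictors):
    """Shrinkage-adjusted R (assumed formula): 1 - (1 - R^2)(N - 1)/(N - k - 1)."""
    adjusted = 1 - (1 - multiple_r ** 2) * (n_cases - 1) / (n_cases - n_predictors - 1)
    return math.sqrt(max(adjusted, 0.0))

# Hypothetical zero-order correlations for group A, group B, and their intercorrelation.
r = multiple_r_two_predictors(0.84, 0.60, 0.65)
print(round(r, 3), round(shrunken_r(r, 20, 2), 3))
```

With a small number of jobs, adding a second predictor that contributes little can leave the adjusted value at or below the single-predictor figure, which is the pattern the text reports.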

Cross-Validation on “Hold-Out” Groups.
From the r’s obtained by correlating the means
of the five most important operations with
the criteria, and the means of all other operations
correlated with the criteria, a multiple-regression
equation was computed for each
company (8). These regression equations
further indicated that little, if anything, is
added by the inclusion of operations beyond
the five considered most important by the
supervisors. However, although certain slight
advantages might accrue from using only
the most important operations,




Table 1

Multiple Correlations and Correlations Derived by
Generalized Regression Equation*

Company    Jobs    Shrunken R     r
I           14       .790        .776
II          25       .793        .801
III         20       .855        .858
IV†         20       .884        .890
V†          24       .884        .867

* Computed from the mean values of the five operations
judged most important and the mean values of all
other operations.
† “Hold-out” groups.


it is felt that these advantages are counterbalanced
by the improved employee relations
which might result from the inclusion of all
operations checked as a part of a particular job.
Consequently, a generalized regression equation,
$X_1 = 4X_2 + X_3$, was set up. This equation
gives four times as much weight to the mean
values of the five operations judged most
important as is given to the mean values of
all other operations. While this gives greatest
emphasis to the five most important operations,
it is felt that it likewise gives adequate
recognition to all other operations included in
the job.

This regression equation was applied to the
data from the two steel mills which constituted
the “hold-out” groups. The resulting r for
Company IV was .890 as compared with the
shrunken multiple R of .884 obtained by
correlating the mean of the five most important
operations with the criteria, plus the mean of
all other operations. For Company V the r
which resulted was .867 as compared to the
shrunken multiple R of .884. It will be
noted from Table 1 that the results were
equally comparable when the generalized
regression equation was applied to the three
companies which furnished the basic data.
These results appear to be sufficient proof of
the value of the generalized equation, as
well as of the validity of the check-lists.
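A minimal sketch of how the generalized equation, four times the mean of the five most important operations plus the mean of all other operations, could be applied to a hold-out group is given below. The job data and criterion values are invented, and the helper names are hypothetical.

```python
import math

def composite_score(mean_top_five, mean_others):
    """Generalized equation from the text: X1 = 4*X2 + X3."""
    return 4 * mean_top_five + mean_others

def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical hold-out jobs: (mean of five most important, mean of all others, criterion).
holdout_jobs = [
    (2.4, 1.9, 1.1),
    (3.6, 2.5, 2.3),
    (4.9, 3.0, 2.9),
    (6.3, 3.2, 4.4),
    (7.0, 4.1, 4.6),
]

scores = [composite_score(x2, x3) for x2, x3, _ in holdout_jobs]
criteria = [c for _, _, c in holdout_jobs]
print(round(pearson_r(scores, criteria), 3))
```

Because the weights are fixed in advance rather than fitted to the hold-out data, the resulting r is a cross-validation figure and needs no shrinkage correction.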


Summary and Conclusions 


The purpose of this study was to determine
the validity of a Job Description Check-List
for evaluating office jobs. A revision of the
check-list developed at Purdue University
was used in the study.

Data were obtained from five companies.
Key office jobs were rated on a paired-comparison
basis by selected judges; check-lists were
completed for the key jobs; and the ten
operations judged most important to the
job were indicated. Judges’ ratings were
used as criteria. After computing zero-order
r’s and shrunken multiple R’s, a generalized
multiple-regression equation was set up for
cross-validation purposes on data from two
“hold-out” groups.

The following conclusions may be drawn:

1. The judges’ ratings, which were used as
criteria, had high reliabilities.

2. The five operations judged most important
to a job appear to be the optimum number
for evaluation purposes. Neither the zero-order
r based on more operations nor the
shrunken multiple R computed from various
groupings of operations yields a significantly
higher correlation.

3. For the promotion of good employee
relations, it is considered advisable that all
operations be included, with the five most
important operations being given most weight.

4. The generalized regression equation,
$X_1 = 4X_2 + X_3$, when applied to the two “hold-out”
groups, predicted the criterion values
almost as well as did the regression equations
derived directly from the data.

5. Within the limits of this study, the Job
Description Check-List of Office Operations
appears to be a valid instrument for evaluation
purposes.

Received May 7, 1951. 


References 


1. Bellows, R. M., and Estep, M. F. Job evaluation
simplified: the utility of the occupational
characteristics check-list. J. appl.
Psychol., 1948, 32, 354-359.

2. Burk, S. L. H. A case history in salary and wage
administration. Personnel, 1939, [volume and pages illegible].

3. Culbertson, [initials illegible]. The adequacy of an operational
check-list for the description of clerical jobs.
Unpublished M.S. thesis, Purdue University,
[year illegible].

4. Dudek, E. E. An operational approach to the evaluation
of [illegible] jobs. Unpublished Ph.D. thesis,
[university and year illegible].

5. Ells, R. W. Salary and wage administration. New
York: McGraw-Hill, 1945.

6. Ferguson, L. W. Clerical salary administration.
New York: Life Office Management Association,
1948.

7. Fisher, R. A., and Yates, F. Statistical tables.
(3rd ed.) New York: Hafner Publishing, 1949.

8. Guilford, J. P. Psychometric methods. New York:
McGraw-Hill, 1936.

9. Kelly, Chalice. Job analysis a basis for payment
according to output. A. M. A. Office Mgmt. Ser.,
1930, 53, 2-16.

10. Kelley, T. L. Fundamentals of statistics. Cambridge,
Mass.: Harvard University Press, 1947.

10a. Miles, Minnie Caddell. The validity of a job description
check list for evaluating office jobs. Unpublished
Ph.D. thesis, Purdue University, 1951.

11. Peters, C. C., and Van Voorhis, W. R. Statistical
procedures and their mathematical bases. New
York: McGraw-Hill, 1940.

12. Stead, W. H., Shartle, C. L., and Associates. Occupational
counseling techniques. New York:
American Book, 1940.

13. Trends in office personnel problems. Anon. Mgmt.
Rev., 1949, 38, 301-302.

Specificity of Over- and Under-Achievement in College Courses 


William C. Krathwohl 


Institute for Psychological Services, Illinois Institute of Technology 


Some work has been done to measure the
intangible traits of industriousness and indolence¹
in contrasting fields. The fields which
have been investigated were those of mathematics
and of English by Krathwohl (3, 4,
5, 6). A question which naturally arises is
whether these traits carry over from one
discipline to another, whether industriousness
and indolence are general or specific and
perhaps related to specific interest.

One way to answer the question of the
independence of work habits is to find a device
for measuring industriousness in a field, and
then to correlate such measures with similar
measures in another field.

Such a device, which will measure industriousness
or indolence in a field, consists of a
comparison of the scores received on aptitude
and achievement tests. If the score of an
individual on an achievement test in some
subject is appreciably higher than his score
on the corresponding aptitude test, such a
person is defined as being industrious in that
subject. If his scores are about the same,
he is defined as being normal; but if his score on
the achievement test is appreciably lower than
his score on the aptitude test, he is defined as
being indolent in that field.

To make such scores comparable, the
transformation of raw scores to standard
scores (x/σ) is the conventional procedure.
In this investigation derived scores were used
which have a mean of 20, a standard deviation
of 4, and are rounded off to the nearest integer.
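The derived-score transformation described above can be sketched as follows. The use of the population standard deviation and of round-half-up are assumptions, since the article does not say how either was handled; the raw scores are invented.

```python
import math
import statistics

def derived_scores(raw_scores, target_mean=20, target_sd=4):
    """Convert raw scores to integer derived scores with the stated mean and standard deviation."""
    mean = statistics.mean(raw_scores)
    sd = statistics.pstdev(raw_scores)  # population SD (an assumption)
    result = []
    for x in raw_scores:
        value = target_mean + target_sd * (x - mean) / sd
        result.append(math.floor(value + 0.5))  # round half up (an assumption)
    return result

# Hypothetical raw scores on one test.
print(derived_scores([10, 20, 30]))  # [15, 20, 25]
```

Once every aptitude and achievement test is on this common scale, a difference of a few points between two tests has the same meaning whichever subject it comes from.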

The experiment to determine the independence
of the indexes of industriousness was set
up with 308 second term sophomores who had
taken the sophomore achievement tests in
May 1948 at the Illinois Institute of Technology.
The tests were on English expression,
chemistry, mathematics, and physics. These
tests were constructed by the Measurement
and Guidance Project of the Educational
Testing Service. The aptitude tests for chemistry,
mathematics and physics were form M
of the Chemistry Aptitude, Mathematics
Aptitude and Physics Aptitude Tests, respectively,
which are published by the Bureau of
Educational Research and Service of the
University of Iowa. The aptitude test selected
for English was the short fifteen minute
vocabulary section of the Cooperative Reading
Comprehension Test, Advanced Form, which
is published by the Cooperative Test Service,
now the Educational Testing Service. The
reason for the selection of a vocabulary test
as an English aptitude test is given in an
article by Krathwohl (5).

¹ For conciseness and to avoid awkward construction,
the word indolence as employed in this investigation is
used not in a derogatory sense, but rather as a substitute
for under-achiever. In the same way the word
industrious is used as a substitute for over-achiever.

Correlations between aptitude and achieve- 
ment varied from 0.42 for mathematics to 
0.58 for English and all coefficients were 
statistically significant. 

The index of industriousness, briefly I.I., 
say for chemistry, for any student was defined 
to be his derived score on the chemistry 
achievement test minus his derived score on 
the chemistry aptitude test. Indexes of 
industriousness for the remaining three subjects 
were computed in a similar manner. Normal 
students were considered to be those students 
whose I.I.’s ranged from −2 to +2. These
normal students constituted approximately 
the middle 50 per cent of the group and so could 
be classed as average in work habits in the 
sense that the word average usually is employed 
in psychology. Industrious students were 
defined to be those whose I.I.’s were equal to 
or greater than 3 and constituted approx- 
imately 25 per cent, or practically the upper 
quartile, of the 308 sophomores in the experi- 
ment. Indolent students were defined to be 
those whose I.I.’s were equal to or less than 
—3 and constituted approximately 25 per 
cent, or practically the lowest quartile, of the 
entire group. 
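The definition of the index and the three categories can be written out as a short sketch. The cut-offs of plus and minus 3 are the ones given in the text; the function names and the sample scores are hypothetical.

```python
def index_of_industriousness(achievement_derived, aptitude_derived):
    """I.I. = derived achievement score minus derived aptitude score."""
    return achievement_derived - aptitude_derived

def classify(index):
    """Apply the cut-offs stated in the text: >= 3 industrious, <= -3 indolent, else normal."""
    if index >= 3:
        return "industrious"
    if index <= -3:
        return "indolent"
    return "normal"

# Hypothetical derived scores (achievement, aptitude) for one student in three subjects.
student = {"mathematics": (25, 20), "English": (21, 20), "physics": (16, 21)}
for subject, (achievement, aptitude) in student.items():
    ii = index_of_industriousness(achievement, aptitude)
    print(subject, ii, classify(ii))
```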


The four aptitude tests were taken in
September 1946 and the four achievement
tests were taken in May 1948 so that one year
and eight months elapsed between the taking
of the aptitude tests and the taking of the
achievement tests. During this period of
time, these students had been exposed to the
vicissitudes of college life, subjected to the
temptations of extra-curricular activities and,
in general, had had an opportunity to settle
down to somewhat more steady study habits
than they had in the beginning of their
freshman year. One advantage of using
tests almost two years apart is the elimination
of the necessity for proving the persistence
of the indexes of industriousness over a two
year period. A second advantage is that the
computation of I.I.’s over a two year period
reflects the changes that may have occurred in
a student’s work habits.

Table 1

Correlations Between Indexes of Industriousness

Indexes                           r       t
English with chemistry           .08     1.3
English with physics             .13     1.4
English with mathematics         .15     2.3
Chemistry with physics           .15     1.2
Mathematics with physics         .26     1.6
Mathematics with chemistry       .34*    4.7
Mathematics with physics**       .18*    2.5

* Significant at one per cent level.
** Freshmen.

That indexes of industriousness really meas- 
ured the effect of the industrious and indolent 
work habits for at least two of such diverse 
subjects as mathematics and English has 
been shown by Krathwohl (3, 4, 5, 6). 

The correlation coefficients between these
various indexes of industriousness are shown
in Table 1, where the first column gives the
value of the correlation coefficients, and the
second column gives the t-ratios. The population
of each group is identical with that of the
same named group in Table 2.

It is evident from these correlation coefficients
that those for the English I.I. with
each of the remaining three indexes are so low
that the conclusion cannot be drawn that
industriousness or indolence in English implies
the same type of work habits in the three
remaining subjects. That is to say, a student
who is industrious in English may or may not
be industrious in mathematics, chemistry, or
physics.
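The article does not state how its t-ratios were obtained. As a hedged illustration, the usual t-ratio for a correlation coefficient, t = r·sqrt(N − 2)/sqrt(1 − r²), is sketched below; with the group sizes given in Table 2 it reproduces two of the tabled values, which suggests (but does not prove) that this is the formula used.

```python
import math

def t_ratio_for_r(r, n):
    """Usual t-ratio for testing a correlation r from n cases against zero (assumed formula)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# r values from Table 1, group sizes from Table 2.
print(round(t_ratio_for_r(0.15, 232), 1))  # 2.3  (English with mathematics)
print(round(t_ratio_for_r(0.26, 37), 1))   # 1.6  (mathematics with physics, sophomores)
```

The second line shows why a correlation of .26 fails to reach significance with only 37 cases.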

In the case of the indexes of industriousness 
for mathematics and chemistry, the correlation 
coefficient between them is fairly small, 0.34, 
but is statistically significant at better than 
the 1 per cent level.

In the case of the indexes of industriousness 
for mathematics and physics for sophomores, 
the correlation coefficient is small, and is not 
significant. However, the frequency, 37, of 
this group is small enough to cast doubts on 
the result. Hence, the entire procedure was 
repeated with 184 freshmen who had taken 
some locally prepared scholarship examinations 
in mathematics achievement and physics 
achievement. 

These 184 students later entered as freshmen 
and took the Iowa Mathematics Aptitude 
Test and the Iowa Physics Aptitude Test. 
The correlation coefficient, using these 184 
freshmen, between the I.I. for mathematics 
and the I.I. for physics turned out to be 0.18, 
which was not very different from the previous 
value of 0.26. However, the increased number 
of freshmen made this coefficient significant 
at almost the 1 per cent level. 

For further illumination on the independence 
of the various indexes of industriousness, the 
chi square method was resorted to, and the 
results are shown in Table 2. 
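A chi square test of independence of the kind referred to here might be sketched as follows. The three-by-three layout (industrious, normal, indolent in each of two subjects) is an assumption suggested by the four degrees of freedom in Table 2, and the cell counts are invented. The closed-form probability applies only to four degrees of freedom.

```python
import math

def chi_square_statistic(table):
    """Pearson chi square and degrees of freedom for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

def p_value_df4(chi2):
    """Upper-tail probability of chi square with exactly 4 degrees of freedom."""
    return math.exp(-chi2 / 2) * (1 + chi2 / 2)

# Hypothetical counts: rows are I.I. classes in one subject, columns in another.
counts = [[22, 30, 14],
          [28, 70, 30],
          [15, 33, 20]]
chi2, df = chi_square_statistic(counts)
print(round(chi2, 2), df, round(p_value_df4(chi2), 2))
```

A probability well above .01 would be read, as in the text, as consistent with independence of the two indexes.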

Table 2

Values of Chi Square Between Various Indexes
of Industriousness

                                                  Degrees of
Indexes                          N       χ²       Freedom       P
English with chemistry          270     5.13         4         .2
English with physics            106     2.9          4         .5
English with mathematics        232     4.29         4         .3
Chemistry with physics           63     2.40         1         .13
Mathematics with physics         37     0.01         1         .92
Mathematics with chemistry      191    15.75         4         .01
Mathematics with physics*       184     5.65         4         .2

* Freshmen.

In Table 2 all the values of P, with the
exception of the indexes of industriousness
for chemistry with mathematics, are so much
larger than 0.01 that the conclusion can be
drawn that they are independent of each other
as shown by Lindquist (7). The value of P
for mathematics I.I. compared with chemistry
I.I., given as 0.01, is really less than that, and
means that there is a relation between the
I.I. for mathematics and the I.I. for chemistry.
The existence of such a relation is borne out
by the significant correlation coefficient of
0.34 between indexes of industriousness for
mathematics and for chemistry mentioned
previously. However, this correlation is low
enough for one to conclude that whatever
relation exists must be a small one.

The case for the indexes of industriousness
for mathematics and physics is settled in
Table 2. Here the high value of P, which
equals 0.92 for the 37 sophomores, raises some
question about the size of the sample of the
population and suggests that it may be too
small. When the larger sample of 184 freshmen
was used, the value of P dropped to 0.2,
which is well within the range where independence
of the indexes of industriousness for
mathematics and physics is assured.

On the whole, it can be said that the indexes
of industriousness for English, chemistry,
mathematics, and physics are independent of
each other with the exception of an unexpected
relationship between work habits in
mathematics and chemistry. It should be
noted that the mathematics I.I. and chemistry
I.I. pair is the only one out of six possible
pairs among English, chemistry, mathematics,
and physics which differs markedly from the
other five. The correlation coefficient between
the two indexes for mathematics and chemistry,
although low, is the highest of the six and is
the only coefficient which is statistically significant.
Furthermore, the mathematics-chemistry
pair is the only one between which the
chi square test indicates a definite relationship.
A situation of this type needs further investigation.
It is possible that such a relationship
exists only among engineering students because
engineering students are known to differ
in some of their characteristics from liberal
arts students and from some other professional
groups, such as pre-law and pre-medicine, as
was found by Fairbairn (1). Therefore, a
study similar to this should be conducted on
liberal arts students and repeated on another
group of engineering students. It is also
possible that the relationship between the
chemistry I.I. and the mathematics I.I. is
due to the nature of the tests used to measure
achievement in mathematics and in chemistry.

An explanation of the ease of proving the 
independence between the I.I. for English 
and the I.I.’s for the sciences as compared with 
proving the independence of the I.I.’s among 
the sciences is seen by comparing the six 
inter-correlations among the four subjects. 
The correlations of English achievement with 
achievement in the three sciences, chemistry, 
mathematics, and physics, vary within the 
narrow range of 0.29 and 0.32, whereas the 
inter-correlations between achievements in the 
three sciences vary within the narrow range of 
0.54 and 0.56. All correlation coefficients are 
statistically significant. Whatever correla- 
tions exist at all between English and the 
sciences are probably due to a common factor 
associated with intelligence. The larger cor- 
relations between the sciences undoubtedly 
are due also to communality of subject matter, 
and probably it is this communality of subject 
matter which explains the difficulty in proving 
the independence of indexes of industriousness 
among the sciences. 

From this investigation it can be concluded, 
certainly for students in an engineering school 
and undoubtedly for others, that indexes of 
industriousness are specific instead of general 
because there exists at least a set of four 
subjects; English, chemistry, mathematics, 
and physics, which either are independent of 
each other or, if dependent, have only a small 
relationship. Hence, there is sufficient evi- 
dence to say that a student should not be 
considered industrious, as such, but rather 
that he is industrious in mathematics or 
English or whatever the field may be. Neither 
should he be considered indolent, as such, but 
rather that he is indolent in mathematics or 
English or whatever the field may be. In 
general, then, it is possible that a student might 
be industrious in mathematics, normal in 
English, and at the same time indolent in 
physics. It is also easily conceivable that the 
idea of specific industriousness may extend 
beyond the academic field into the fields, say,
of commerce and industry.



Another conclusion that can be drawn is 
that if indexes of industriousness are compared 
in two fields which have a communality of 
subject matter, there is a possibility of a slight 
carry over of work habits from one field to 
the other, but such a possibility is small and 
sometimes it does not occur at all. 

The specificity of work habits of industrious- 
ness follows very strikingly along lines similar 
to that found by Hartshorne and May (2) 
in their studies of social habits, such as honesty, 
truthfulness, and morality. They found that 
these social habits were specific instead of 
general. That is to say, we cannot speak of 
an honest man, but rather of an honest act. 
For instance, a man may be honest in his 
income tax returns, but dishonest when he 
fails to report to the Lost and Found Bureau an 
article which he has found. In like manner, 
we cannot speak of an industrious individual, 
but rather we must say that an individual is 
industrious in some one area of activity, 
whereas he may be indolent or normal in 
another. 

Although this study has been done only on 
engineering students, it seems reasonable to 
assume that these conclusions should also 
hold for liberal arts students, although that 
fact needs verification. 

Because of the specific nature of work
habits, it is possible that some of the difficulties
which investigators have had with
under-achievement are due to their attempt to


cover too many diverse fields of study at the 
same time. 


Summary 


1. Certainly, as far as engineering students
are concerned and undoubtedly for others,
industriousness in any one of the four subjects
(English, chemistry, mathematics, and physics)
does not necessarily imply industriousness in
any of the remaining three. A possible
exception is one involving mathematics and
chemistry, in which there is only a slight
possibility that work habits in mathematics
may be associated with the same kind of work
habits in chemistry.

2. An individual should not be considered
industrious as such, but rather industrious in
mathematics or English or whatever the
subject may be. Such information is particularly
valuable in a counseling situation where
it should be remembered that it is possible
for an individual to be industrious in mathematics,
normal in English, and indolent in
physics at one and the same time.

3. As far as specificity is concerned, work
habits of industriousness are very similar to
the social habits of honesty, truthfulness, and
morality which were investigated by Hartshorne
and May.
Received May 14, 1951. 
References 
1. Fairbairn, Helen. Vocational interests. In E. S.
Jones (Ed.), University of Buffalo Studies, 1930,
8, 61-65.

2. Hartshorne, H., May, M. A., and Shuttleworth,
F. K. Studies in the organization of character.
New York: Macmillan Company, 1930.

3. Krathwohl, W. C. The persistence in college of
industrious and indolent work habits. J. educ.
Res., 1949, 42, 365-370.

4. Krathwohl, W. C. Effects of industrious and indolent
work habits on grade prediction in college
mathematics. J. educ. Res., 1949, 43, 32-40.

5. Krathwohl, W. C. An index of industriousness for
English. J. educ. Psychol., 1949, 40, 469-481.

6. Krathwohl, W. C. Relative contributions of vocabulary
and an index of industriousness for English
to achievement in English. J. educ. Psychol.,
1951, 42, 97-104.

7. Lindquist, E. F. Statistical analysis in educational
research. Boston: Houghton Mifflin, 1940. Pp.
41-43.


The Role of Tests in the Medical Selection Program 


Ray B. Ralph and Calvin W. Taylor 


Department of Psychology, University of Utah


The problem of the proper selection of
medical students from the ranks of applicants
has long existed. In recent years this problem
has been intensified by the increased number
of applicants desiring admission. While this
has made the selection ratio more favorable,
it has at the same time required that medical
selection committees spend considerable time
and effort on the complex task of trying to
identify the best prospects. Recent world
events together with the current defense
program have intensified the importance of
this problem. It is hoped that the present
article will focus more attention upon and
provide further insight into the medical
selection problem.

In 1946 the Moss Scholastic Aptitude Test
was discarded by the Association of American
Medical Colleges in favor of the Professional
Aptitude Test, which was renamed the Medical
College Admission Test (MCAT) in October,
1948. Hereafter, in order to avoid confusion,
this officially used test will be called by its
current name, Medical College Admission Test,
inasmuch as the only important change, other
than renaming, made in the test during
the periods herein reported was the addition
of the subtest, Modern Society.


The Medical College Admission Test 


Admittedly, evaluation studies on the
Medical College Admission Test have not been
designed as well as would be desirable, partly
because all of the medical student samples
studied have been selected at least partially on
the basis of the test being evaluated. Nonetheless,
one can study the role played by the
MCAT scores in the complex medical selection
program by determining means and standard
deviations on recently selected medical classes.

The mean and standard deviation for the
MCAT subtests on each year’s medical
applicant population are 500 and 100, respectively.
If considerable attention is given to
a particular MCAT score in the selection
process, then the medical class would be
highly selected on that particular characteristic,
as indicated by a high mean and a low
standard deviation. On the other hand, if a
score is largely disregarded, then the sample
selected will have a mean near 500, a large
standard deviation, and will not differ greatly
from the total medical applicant population
on this characteristic. It was also decided to
use medical academic success, the grade point
average over the first portion of medical
training, as a criterion and to validate the
test scores against this criterion.
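
The reasoning about class means and standard deviations can be illustrated numerically. What follows is a minimal sketch and not the authors' procedure: it assumes that applicant scores on a subtest are normally distributed (mean 500, SD 100) and that a class is formed by admitting everyone at or above a single hypothetical cutoff on that subtest, which no actual committee is reported to have done. The function names and the cutoff values are illustrative assumptions.

```python
import math


def normal_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def selected_class_stats(cutoff: float, mean: float = 500.0, sd: float = 100.0):
    """Mean and SD of scores at or above `cutoff` in a normal applicant pool.

    Uses the standard formulas for a normal distribution truncated from below.
    """
    a = (cutoff - mean) / sd
    lam = normal_pdf(a) / (1.0 - normal_cdf(a))  # inverse Mills ratio
    sel_mean = mean + sd * lam
    sel_sd = sd * math.sqrt(1.0 + a * lam - lam * lam)
    return sel_mean, sel_sd


# Hypothetical cutoffs, from "score largely disregarded" to "heavily used".
for cutoff in (100.0, 400.0, 500.0, 600.0):
    m, s = selected_class_stats(cutoff)
    print(f"cutoff {cutoff:5.0f}: class mean {m:6.1f}, class SD {s:5.1f}")
```

Under these assumptions a cutoff at the applicant mean gives a class mean near 580 and a standard deviation near 60, while a cutoff far below the mean leaves both statistics close to 500 and 100. This is the pattern the text uses to infer how much weight a score received.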

The first reported study on the test was
performed by Young and Pierson.¹ Scores on
the MCAT were correlated with first quarter
medical college grade point averages for a
sample of fifty freshman medical students.
Results of this study are listed in the first
data column of Tables 1, 2, and 3 under the
heading “1947 Utah Class, First Quarter
Criterion.” They further reported the grade
point average for premedical science courses
(one important basis for student selection in
the program) to be most highly correlated
(.50) with first quarter medical college grades.

The present writers followed up this same 
sample of medical students further and deter- 
mined their scholastic success at the end of 5 
quarters and at the end of the four scholastic 
years (12-quarter accelerated program). The 
additional statistical findings on the 1947 
Utah class are also listed in Tables 1, 2, and 3. 
The 1948 and 1949 Utah classes were studied 
in a similar manner by the present writers with 
the measure of academic success in each case 
being the first medical year grade point 
average. These results are also listed in the 
three tables. Test scores of students in the 


¹ Young, R. H., and Pierson, G. A. The Professional
Aptitude Test, 1947, a preliminary evaluation. J. Ass.
Amer. Med. Coll., 1948, 23, 176-179. These investigators
also studied the Moss Scholastic Test, the Strong Vocational
Interest Blank for Men, and the Minnesota
Multiphasic Personality Inventory on the same medical
class.




Table 1

Means for the Medical College Admission Test

                                 1947 Utah 1947 Utah 1947 Utah 1948 Utah 1949 Utah  Michigan      Iowa
Test Score                        1st Qtr.   5 Qtrs.    4 Yrs.     1 Yr.     1 Yr.     1 Yr.     1 Yr.

Scientific Vocabulary                576.2     579.8     579.8     571.2     548.6  [illeg.]  [illeg.]
Social Vocabulary                    534.2     530.2     530.2     508.8     520.6  [illeg.]  [illeg.]
Humanistic Vocabulary                542.0     536.8     536.8     507.1     521.0  [illeg.]  [illeg.]
Composite Verbal Ability             549.0     551.6     551.6     532.2     525.4  [illeg.]  [illeg.]
Quantitative Ability                 528.8     527.0     527.0     518.2     573.1  [illeg.]  [illeg.]
Index of General Ability             552.8     548.9     548.9     529.8     546.4     575.2  [illeg.]
Modern Society                          --        --        --    520.7*     517.7        --  [illeg.]
Premedical Science Achievement       588.0     591.4     591.4     588.0     583.3     606.5  [illeg.]

Number of Students Sampled              50        44        44        51        52       102        81

* N = 42 students.
[illeg.]: entry illegible in the source scan.
Table 2

Standard Deviations for the Medical College Admission Test

                                 1947 Utah 1947 Utah 1947 Utah 1948 Utah 1949 Utah  Michigan      Iowa
Test Score                        1st Qtr.   5 Qtrs.    4 Yrs.     1 Yr.     1 Yr.     1 Yr.     1 Yr.

Scientific Vocabulary                 71.4  [illeg.]  [illeg.]      87.4      73.2      85.6  [illeg.]
Social Vocabulary                 [illeg.]      72.8      72.8      76.9      72.8      90.7  [illeg.]
Humanistic Vocabulary                 83.8      85.1      85.1      74.8      86.4      83.0  [illeg.]
Composite Verbal Ability              74.9      74.2      74.2      69.9      67.1      81.3  [illeg.]
Quantitative Ability                  70.7      70.7      70.7      82.3      88.6      80.4  [illeg.]
Index of General Ability              70.6      70.4      70.4      70.4      64.5      80.3  [illeg.]
Modern Society                          --        --        --     76.8*      71.2        --  [illeg.]
Premedical Science Achievement        84.7      76.4      76.4      65.8      66.7      74.4  [illeg.]

Number of Students Sampled              50        44        44        51        52       102        81

* N = 42 students.
[illeg.]: entry illegible in the source scan.
Table 3

Validity Coefficients for the Medical College Admission Test

                                 1947 Utah 1947 Utah 1947 Utah 1948 Utah 1949 Utah  Michigan      Iowa
Test Score                        1st Qtr.   5 Qtrs.    4 Yrs.     1 Yr.     1 Yr.     1 Yr.     1 Yr.

Scientific Vocabulary                  .23  [illeg.]       .16       .22       .21  [illeg.]  [illeg.]
Social Vocabulary                      .08  [illeg.]  [illeg.]  [illeg.]  [illeg.]       .05  [illeg.]
Humanistic Vocabulary                 -.22      -.06      -.02      -.03      -.23       .03       .21
Composite Verbal Ability              -.10       .06       .07       .07       .13       .10       .38
Quantitative Ability                   .19  [illeg.]       .23       .20       .26       .23  [illeg.]
Index of General Ability               .02       .08       .08       .14      -.02       .15       .34
Modern Society                          --        --        --      .10*       .03        --  [illeg.]
Premedical Science Achievement         .24       .26       .23       .18       .19       .26       .48

Number of Students Sampled              50        44        44        51        52       102        81

* N = 42 students.
[illeg.]: entry illegible in the source scan.




same class who had taken different forms of 
the Medical College Admission Test were 
lumped together, the different forms being 
considered identical in the statistical treatment. 
This assumption, however, did not particularly 
alter the results found on a subsample of 
those who took only the same form. Since 
only 42 of the 51 students in the 1948 Utah 
class took the newer form with the Modern
Society subtest, results for this subtest were
based on the reduced sample.

The results of two unpublished studies, one
on a Michigan medical class by R. M. W.
Travers and the other on an Iowa medical
class by the University Examination Service,
are also listed in Tables 1, 2, and 3. Similar
to the findings of Young and Pierson, the Iowa
study indicated that the grade point average
in premedical science classes, with a validity
coefficient of .55, was a better predictor of
first year medical success than was any part of
the MCAT. It was also found that, in general,
scores on this 1947 form correlated higher with
grades already attained in premedical courses
than with the subsequently achieved success in
the first year of medical college.

An examination of the means and standard
deviations of the MCAT subtests on the
reported studies affords some insight into the
selection procedure utilized on each class.
From the results in Tables 1 and 2 it seems
evident that more attention was given to
scores in some of the subtests than others in
the selection of the medical classes.

An examination of Table 3 shows the validity
coefficients for the various subtests across
the samples were generally low and in several
cases were essentially zero. Individual subtests
in certain of the studies occasionally
showed meaningful validities, but in such
cases this result was counterbalanced by
essentially zero validities in the other samples.
In terms of medical academic success, Scientific
Vocabulary, Quantitative Ability, and Premedical
Science Achievement appear to be
the only consistently valid subtests of the
MCAT, even after considering restriction of
range.
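
The article does not say how restriction of range was taken into account. One common way to gauge its effect is the standard univariate correction for direct selection on the predictor, sketched below. The choice of this formula, the function name, and the assumption that selection operated directly on the subtest in question are all illustrative and are not taken from the article. The example values (.24 and 84.7 for Premedical Science Achievement in the 1947 Utah class, and an applicant SD of 100) are read from Tables 2 and 3 and from the text.

```python
import math


def correct_for_range_restriction(r: float, sd_selected: float, sd_applicants: float) -> float:
    """Estimate the applicant-pool correlation from the correlation observed
    in a selected group, assuming direct selection on the predictor."""
    k = sd_applicants / sd_selected
    return (r * k) / math.sqrt(1.0 - r * r + r * r * k * k)


# Premedical Science Achievement, 1947 Utah class, first quarter criterion.
observed_r = 0.24
estimated_r = correct_for_range_restriction(observed_r, sd_selected=84.7, sd_applicants=100.0)
print(f"observed r = {observed_r:.2f}, estimated unrestricted r = {estimated_r:.2f}")
```

With a class standard deviation this close to the applicant value, the estimate rises only slightly (to about .28), which fits the article's remark that several subtests remain weak predictors even after restriction of range is considered.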

In summary, on five samples of medical
students from three universities, it is evident
that the medical students are more highly
selected on certain MCAT subtest characteristics
than on others. Many of the subtests
in the MCAT have shown little evidence of
being valid as predictors of medical academic
success. It may be possible, however, that
some of these subtests (e.g., Modern Society)
are valid for some desirable purpose or purposes
other than the prediction of medical academic
success. If any of the subtests were developed
for other purposes, it would be advisable to
define these purposes clearly so that studies
could be designed to see how well the subtests
achieve these other goals.


Evaluation of Some Other Aptitude Scores 


It was decided to attempt the validation of 
some other promising aptitude scores to 
determine if they were related to medical 
academic success. The General Aptitude Test 
Battery (GATB) was made available for this 
experimental study. This battery, developed 
for use in public employment office counseling 
programs, consists of 16 tests which yield 11 
aptitude scores (Letter Series, Test “E,” was 
treated as measuring a separate aptitude). 
Most of these 11 aptitude factors are well
known. Measures of identical or similarly
named aptitudes are found in several other 
test batteries and particularly in factorial 
research studies. 

The GATB aptitude scores were established
so that the mean score for the worker population
is 100 and the standard deviation is 20.²
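
Because the GATB scale is defined by a worker-population mean of 100 and a standard deviation of 20, any score can be restated in standard-deviation units. The short sketch below does this for the four class means that Table 4 reports most clearly. The function name is an illustrative assumption, and the conversion says nothing about the shape of the worker distribution.

```python
GATB_MEAN = 100.0  # worker-population mean stated in the text
GATB_SD = 20.0     # worker-population standard deviation stated in the text


def to_standard_units(score: float) -> float:
    """Distance of a GATB aptitude score from the worker mean, in SD units."""
    return (score - GATB_MEAN) / GATB_SD


# Class means for the 1947 Utah class, as reported in Table 4.
class_means = {"G": 143.0, "V": 137.6, "N": 132.6, "S": 128.0}

for aptitude, mean_score in class_means.items():
    print(f"{aptitude}: {mean_score:5.1f} is {to_standard_units(mean_score):.2f} SD above the worker mean")
```

On this scale the class mean for G lies a little over two worker-population standard deviations above the worker mean, which gives a sense of how highly selected the medical class was.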

The 1947 Utah class was tested with the 
GATB after they had completed five quarters 
of medical training. With the exception of 
five persons, all of the forty-nine medical 
sophomores tested had been selected for 
medical training partly on the basis of their 
scores on the MCAT. The means, standard 
deviations, and validity coefficients for the 
GATB aptitudes against the five-quarter grade 
point average are listed in Table 4. These 


² For further information about the General Aptitude
Test Battery, see the following references: (a) Dvorak,
Beatrice J. The new USES General Aptitude Test
Battery. J. appl. Psychol., 1947, 31, 372-376; (b) Staff,
Div. of Occupational Analysis, WMC. Factor analysis
of occupational aptitude tests. Educ. psychol. Measmt.,
1945, 5, 147-155; (c) GATB [word illegible] Project Staff,
University of Utah, et al. General Aptitude Test Battery
patterns for college areas. Occupations, 1951, 29, 518-526;
and (d) Petrullo, L., Cohen, I. K., and Meigh, C.
The Employment Service testi[…] Secur. Rev., 1949, 16, 19.


Table 4

Means, Standard Deviations, and Validities for the
General Aptitude Test Battery on the 1947
Utah Class (N = 49)

                                       Standard   Validity Coefficient
GATB Score                     Mean   Deviation     5 Qtr.     4 Yr.

Intelligence (G)              143.0      11.9         .47     [illeg.]
Verbal (V)                    137.6      14.4         .45       .42
Numerical (N)                 132.6      12.8         .39       .58
Spatial (S)                   128.0      10.7         .41     [illeg.]
Form Perception (P)           126.3      13.7         .12     [illeg.]
Clerical Perception (Q)       123.0      20.0         .14       .19
Test “E”                      129.1    [illeg.]      -.06       .10
Aiming (A)                    107.2      23.8      [illeg.]    -.04
Motor Speed (T)                98.9      25.5      [illeg.]   [illeg.]
Finger Dexterity (F)           97.5      16.6        -.01       .13
Manual Dexterity (M)          109.8      29.9        -.06       .05

[illeg.]: entry illegible in the source scan.


students were then followed up through 
graduation and new validities computed against 
the grade point average for the total four-year 
training program. These results are also 
presented in Table 4. 

The method of testing persons who have 
completed training and on whom criterion 
scores of success are already available is often 
utilized to make a rapid evaluation of apti- 
tudes. When this method is used, as in the 
present study on the GATB (particularly in 
the case of the five-quarter criterion), it is 
highly advisable to conduct additional studies 
in order to check the results on the initial 
study. These studies should preferably be of 
the follow-up type in which persons would be 
tested prior to medical training (but not selected 
on the basis of these experimental test results) 
and then followed up to ascertain their even- 
tual degree of success in training. 

As in the case of the MCAT, it appeared
wise in searching for valid aptitudes to use a
multiple evaluation approach in examining
the results for the GATB. The things
considered for each aptitude were the mean,
standard deviation, and validity coefficients,
together with a judgment of whether or not
it makes psychological sense to identify that
aptitude as important in medical academic
success. In terms of this multiple evaluation,
the first four aptitudes in Table 4, namely
General Intelligence (G), Verbal (V), Numerical
(N), and Spatial (S), were considered
sufficiently valuable to warrant further serious
consideration, whereas the other aptitudes
were judged to be either of borderline value
or of no value with regard to success in medical
training.


Discussion 


A reduced battery yielding four aptitude
scores, G, V, N, and S, can be administered in
less than 45 minutes, of which 29 minutes is
actual testing time. This time is 1/7 as long
as the total testing time of 6 hours and 45
minutes for the current MCAT. Even though
restriction of range was strongly evident on
all these GATB aptitudes, a multiple correlation
coefficient of .56 was obtained for this
reduced battery against the five-quarter criterion.
A higher multiple correlation coefficient
of .60 was found for the 4-aptitude battery
against the four-year criterion.

From the above results it appears that the 
four aptitude combination competes favorably 
with the MCAT both in predictive value and 
in testing time required. Although no direct 
comparison is possible because of different 
standardization populations it can be seen 
from all the results presented that there is some 
restriction of range (with subsequent effect 
on the size of validities) on the four GATB 
aptitudes as well as on the MCAT subtests. 
At the same time it should be noted that the 
premedical science grade point average has 
often been found to be the best, or one of the 
best, predictors of medical academic success. 
It appears likely that these results were 
obtained despite the handicap of restriction 
of range resulting from the important role the 
premedical science grade point average plays 
in many selection programs. 

In the 1949 form of the MCAT only one
Verbal Ability score was given, so that the
profile contained 5 instead of 8 subscores.
One wonders what the correlation would be
between this single Verbal Ability score and
the four verbal scores previously reported on
the MCAT. Is it the same as the composite
score or is it identical to one of the Vocabulary
subtests? On the surface, this Verbal Ability
score is apparently somewhat different in composition
from the previous composite Verbal
Ability score and from the three MCAT
Vocabulary scores that have been evaluated
here. The latest Verbal Ability score is
described as a composite taken from a vocabulary
section and a reading comprehension
section.

More recently the Index of General Ability
score has been dropped, leaving only 4 scores
in the MCAT profile: Verbal Ability, Quantitative
Ability, Modern Society, and Science.
The reduction of the number of scores in the
profile was for reasons of simplicity, and in this
simplification process most of the subtests
that were poor predictors of the present
criterion, medical academic success, were
eliminated. The Modern Society subtest is
the only subtest in the current MCAT that is
primarily symbolic of the need for other well
defined criteria of medical success. However,
if not much attention is paid to this subscore
in the actual selection program, then it is not
playing its designated role well. A way of
more certainly insuring that all medical
doctors have a prescribed knowledge of
modern society would be to require an appropriate
training course instead of having
applicants take a test, the results of which
might, in practice, be somewhat ignored in the
complex medical selection program.

The relationship between all types of scores
found to have significant scholastic predictive
value, such as the premedical science grade
point average, the four aptitude scores, and 
certain subtests still in the MCAT, should be 
investigated. Of particular interest would be 
the relationship between the GATB Verbal 
Aptitude score and the most recent Verbal 
Ability score. Unfortunately, these two sets 
of scores have not as yet been obtained on the 
same sample. Furthermore, the particular 
combination of all these scores that will yield 
the maximum validity should be determined 
so that the best composite battery of valid 
measures can be available for use in predicting 
scholastic success. 

It is very likely that the best combination 
of the previously mentioned predictors 
would still leave a sizable fraction of medical 
academic success untouched. This would 
undoubtedly be also true for any other sug- 
gested criterion of medical success. Further 
research is therefore clearly needed to inves- 
tigate the value of other parts of medical 
selection programs and to develop devices and 
procedures that will get at additional character- 
istics important in total medical success. It 
is also suggested that any new devices and/or 
procedures be thoroughly evaluated by means 
of well designed studies before they are widely 
installed.


Received May 25, 1951. 


Faking Personality Test Scores in a Simulated Employment Situation 


Alexander G. Wesman 


The Psychological Corporation, New York City 


It has been the experience of most industrial 
psychologists that personality and interest 
inventories are ineffective when used for 
selection purposes (1, 2, 3, 4, 6, 7, 8). Ordinarily,
many of the items can be seen through by
most applicants, and the appropriate response 
given. The stereotypes which many employ- 
ment officers seek (e.g., aggressive, self-confi- 
dent salesmen) are also the stereotypes which 
the applicant expects the employer to be 
seeking. He is therefore all too likely to 
respond accordingly. 

The data reported herein were collected in 
the course of a teaching demonstration. The 
author wished to impress a group of extension 
students at a large university with the untrust- 
worthiness of personality inventories in 
employee selection. He gave the Bernreuter 
Personality Inventory to a group of 85 students 
with about the following instructions: 


“I want you to pretend that you are applying 
for the position of salesman in a large industrial 
organization. You have been unemployed for 
some time, have a family to support, and want 
very much to land this position. You are being 
given this test by the employment manager.
Please mark the answers you would give.”


The following week, at the start of class, the 
Same inventory was again distributed to the 
class, with the following instructions: 


"You are now applying for the position of 
librarian in a small town. You need the em- 
ployment to support your family and meet 


financial obligations. Please mark the answers 
you would give.” 


Both administrations of the inventory
occurred before there was any discussion of
the field of personality measurement. The 73
students who took the test twice were a very
heterogeneous group in age, academic background,
industrial experience, and test sophistication.
On the latter variable, they ranged
from a young lady taking her first course since
high school, with almost complete innocence
of the test field, to a young man about to
receive a Ph.D. in measurement, with several
years of professional experience behind him.
Table 1 presents the score distributions 
obtained from these two administrations of the 
inventory for one of the measured traits, 
Self-Confidence (Scale F-1) (5). The table 


Table 1

Students’ Scores on a Self-Confidence Scale in
Two Simulated Employment Situations

Self-Confidence Scale          Employment Situation
Raw Score*                     Salesman    Librarian

Minus Values
  260-241                          1
  240-221                          2
  220-201                         18
  200-181                         27
  180-161                         11
  160-141 through 20-1        [illeg.]     [illeg.]

Plus Values
  0-19 through 280-299        [illeg.]     [illeg.]

Total                             73           73

* Minus scores represent greater self-confidence.
[illeg.]: frequencies illegible in the source scan. Class intervals run
in steps of 20 points within each range.




speaks eloquently for itself. If one saw these
distributions without foreknowledge of how
they were obtained, he could only conclude
that they represented two quite different
groups of people. The first column, “Salesman,”
is apparently composed of people who
are, with three exceptions, above average
in self-confidence. The second group, “Librarian,”
seems to contain almost as many below-average
people on this trait as above-average
(34 and 39, respectively). Those at the fifth
percentile of the first group are more self-confident
than the “applicants” at the fiftieth
percentile of the second group. It is hard to
realize that these “two” groups are really one
and the same, except that the positions for
which they are pretending to apply are
different.
The demonstration is, of course, artificial.
These are not true applicants. They are
students pretending that they are applicants.
Unquestionably, some of them are more test-wise
(and stereotype-wise) than the average
real applicant. Nonetheless, the demonstration
seems to the author sufficiently dramatic
to point up the susceptibility to faking of
personality inventories in industrial situations.
Teachers who have not already used
similar demonstrations with their students
will find this approach rewarding.


Received June 4, 1951. 


References 


1. Benton, A. L., and Kornhauser, G. I. A study of
“score faking” on a medical interest test. J.
Ass. Amer. Med. Coll., 1948, 23, 57-60.

2. Bordin, E. S. A theory of vocational interests as
dynamic phenomena. Educ. psychol. Measmt.,
1943, 3, 49-65.

3. Cofer, C. N., Chance, June, and Judson, A. J. A
study of malingering on the MMPI. J. Psychol.,
1949, 27, 491-499.

4. Ellis, A. The validity of personality questionnaires.
Psychol. Bull., 1946, 43, 385-440.

5. Flanagan, J. C. Factor analysis in the study of personality.
Stanford: Stanford University Press,
1935. Pp. 103.

6. Hunt, H. F. The effect of deliberate deception on
Minnesota Multiphasic Personality Inventory
performance. J. consult. Psychol., 1948, 12, 396-402.

7. Longstaff, H. P. Fakability of the Strong Interest
Blank and the Kuder Preference Record. J.
appl. Psychol., 1948, 32, 360-369.

8. Paterson, D. G. Vocational interest inventories in
selection. Occupations, 1946, 25, 152-153.


The Relationship Between Ortho-Rater Tests of Acuity and 
Color Vision in a Senescent Group 


Robert W. Kleemeier 


Moosehaven Research Laboratory, Orange Park, Florida 


In a recent report on Ortho-Rater norms 
and sex differences Ely, Kephart, and Tiffin 
(2) noted that a sample of 7,597 male and 
2,457 female industrial employees showed an 
unexpected difference in color vision scores. 
These authors say, “It will be noted that in the 
color vision test a difference in favor of the 
men was found. This difference was signif- 
icant at the 1% level. The authors are aware 
that this finding is contrary to long accepted 
theories and facts about the distribution of 
color blindness among the sexes. The explana- 
tion for this difference in findings is not 
known.” 

Presented below is evidence gathered from 
tests administered to a group of aged male 
subjects, which, we believe, provides the 
explanation to the above mentioned dilemma. 
This evidence seems to indicate that the 
answer lies not in the realm of color vision, but 
rather stems from the fact that in the industrial 
sample studied women had significantly poorer 
distance acuity than men. Thus, their poorer 
performance on the color test is, perhaps, 
simply a reflection of their poorer visual acuity.


Method 


Subjects in our study were 128 male residents
in a fraternal home for the aged. Table 1
shows the age characteristics of this group.
The tests were given as a part of a routine
battery administered to residents of the home.

Table 1

Age Distribution of Subjects

Group       Age       N     Mean Age

A          65-70      24      67.6
B          71-75      40      72.9
C          76-80      44      77.7
D          81-85      20      82.6

Total                128      75.1


Far distance binocular acuity test scores and
color test scores were available on 123 of the
total group tested. In addition, paired near
distance acuity and color scores were obtained
on 127 of this group.

All tests of visual performance were made on
the Ortho-Rater under standard conditions
(3). Our aim in giving these tests was to
measure the quality of visual performance
exhibited by the subject at the time of testing;
consequently, subjects who customarily wore
corrections were tested while wearing them.
At the completion of such tests, measures of
visual acuity were obtained without corrections.


Results 


A product-moment r of .675 was obtained
between the Ortho-Rater color test (F-7) and
the far distance binocular acuity test (F-3).
Using z transformations, the 1% fiducial limits
of this correlation are .786 and .523. N is,
of course, 123.

A somewhat lower but still significant r of
.487 was obtained between results on the near
acuity test (N-1) and the color test. The 1%
fiducial limits of this r (N=127) are .646 and
.295. Since the color test is given at the far
distance, this lower r with near acuity is to
be expected.

To round out the intercorrelational possibilities
presented here, we find an r of .569
between the acuity tests at the two distances.
With an N of 85, the 1% fiducial limits for
this r are .335 and .724.
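
The limits quoted above can be approximated with Fisher's z transformation. The sketch below is one conventional way of doing so and is not necessarily the author's exact computation: it assumes a standard error of 1/sqrt(N - 3) and a two-sided 1% normal deviate of 2.5758, and the function name is illustrative.

```python
import math


def fisher_limits(r: float, n: int, z_crit: float = 2.5758):
    """Approximate two-sided limits for a correlation via Fisher's z.

    z_crit = 2.5758 corresponds to the 1% level (two-sided).
    """
    z = math.atanh(r)                 # Fisher's z transformation of r
    se = 1.0 / math.sqrt(n - 3)       # large-sample standard error of z
    lower = math.tanh(z - z_crit * se)
    upper = math.tanh(z + z_crit * se)
    return lower, upper


# The three correlations reported in the Results section.
for label, r, n in (("color vs. far acuity", 0.675, 123),
                    ("color vs. near acuity", 0.487, 127),
                    ("far vs. near acuity", 0.569, 85)):
    lo, hi = fisher_limits(r, n)
    print(f"{label}: r = {r:.3f}, N = {n}, limits {lo:.3f} to {hi:.3f}")
```

Under these assumptions the computed limits fall within roughly .01 of the printed values, with the small differences presumably due to rounding or table look-up in the original work.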

Because of the relatively poor visual acuity
in our group, the relationship between performance
on acuity and color tests was immediately
obvious to the examiner. Those who
had great difficulty with the test objects in
the acuity tests regularly exhibited difficulty
not only with the color test objects but with
all other visual tests in the battery. It was
this observation which led us to correlate
visual acuity and color.

In Figure 1 the median far distance binocular
visual acuity scores for the four age groups
shown in Table 1 are given. It will be noted
that the senescent group has considerably
poorer performance than the younger industrial
group (2). Thus, on the Ely, Kephart, and
Tiffin norms, the median scores for our four
age groups on Test F-3 would be as follows:
(A) eighth, (B) sixth, (C) third, (D) second
percentile. These scores show somewhat
dramatically the amount of deterioration which
has taken place in the visual acuity of this
particular senescent group. It seems, however,
that the visual performance of the great
majority of these men is adequate for the
demands made upon them.


[Fig. 1. Median far distance binocular visual acuity
scores for the four age groups shown in Table 1. Quartiles
indicated by dotted lines. For an Ortho-Rater score
of [illegible] the equivalent Snellen notation is 20/20.
Vertical axis: Ortho-Rater score (F-3). Horizontal axis:
age group (A, B, C, D).]


Discussion


In view of our findings, the explanation of
the poor visual performance of the women in


the industrial group on the color vision test 
seems obvious. Ely, Kephart and Tiffin note 
that the mean score for males on the color test 
(F-7) was 5.08 and the mean for females was 
4.68 on this test. They also show that the 
mean score for far distance binocular acuity 
(F-3) for their male population was 10.69 and 
for the female population was 9.64. This 
difference of 1.05 is significant at the 1% level. 
In view of our finding of a correlation of .675 
between far distance binocular acuity and the 
color perception test, it is not at all surprising 
that the women in this particular industrial 
sample scored lower than the men on Test F-7 
(color). Thus, it would seem that the major 
reason for their lower score was not a defi- 
ciency in color perception but rather a 
deficiency in visual acuity. They simply 
couldn’t see the color chart as well as could 
the men. 

These results also have bearing upon an 
observation made by Boice, Tinker and Pater- 
son (1) who obtained in a small male sample 
(N=40), age 60 years or older, an unusually 
high percentage of color blindness (20%). 
This evidence, they state, suggests the possibil- 
ity “. . . that, with advanced age, changes in 
the retina, the optical nerve or the visual 
cortex occur in an unusually high percentage 
of cases.” Here, too, the factor of visual 
acuity needs control before we speculate too 
much upon the existence of a special deteriora- 
tion of color vision with age. 

Tiffin (4, p. 225) has also noted a diminution 
of color sensitivity with age. Using Ortho- 
Rater data gathered on an industrial sample 
of over 10,000 men and women, he observed 
that “. . . after age 45 both sexes show a loss 
in color vision. In an earlier report . . . it 
was shown that decreases in color vision 
began by age 25. Both studies agree that 
color vision deteriorates with advanced age.” 

In view of possible contamination of these 
results with uncontrolled visual acuity, these 
reported age trends in color vision are open to 
question. Thus, it would appear that any 
attempt to ascertain the relationship between 
color vision and age can be successful only if 
visual acuity is somehow controlled. This is 
particularly true when pseudo-isochromatic 


color tests such as the Ishihara or the Ortho-Rater
are used.

Received January 11, 1952.
Early publication.

References

1. Boice, Mary L., Tinker, M. A., and Paterson, D. G.
Color vision and age. Amer. J. Psychol., 1948,
61, 520-526.

2. Ely, J. H., Kephart, N. C., and Tiffin, J. Ortho-Rater
norms and sex differences. J. appl.
Psychol., 1950, 34, 232-234.

3. Standard practice in the administration of the Bausch
& Lomb occupational vision tests with the Ortho-Rater.
Rochester, N. Y.: Bausch and Lomb, 1944.

4. Tiffin, J. Industrial psychology. (2nd Ed.) New
York: Prentice-Hall, Inc., 1947.


Note on Table for Use With Spearman-Brown Formula 


Lee W. Cozan 
Hechinger Company, Washington, D. C. 


B In order to facilitate the use of the Spearman- 
rown prophecy formula, the writer has 
anes a table that shows the effects of 
porary the number of independent measure- 
nm upon the reliability coefficient. The 
A le is simple to use. The table is entered 
ey by the original reliability coefficient 
nd horizontally by the number of times the 
measure is increased. 

For example, if the reliability coefficient of
a twenty minute employment test is 0.50,
increasing the length of the test to one hour
should increase the reliability coefficient to
0.75. If the reliability coefficient of performance
ratings made by one supervisor is 0.75,
the reliability of the pooled ratings of five raters
should be 0.94. This table permits rapid and accurate
determination of the reliability coefficient and
eliminates the calculations previously involved
in the application of the Spearman-Brown formula.

It is hoped that the applicability and
usefulness of the table will be revealed by future use.

¹ To reduce printing costs the table has been deposited
with the American Documentation Institute. Order
Document 3308 from American Documentation Institute,
1719 N Street, N.W., Washington 6, D. C., remitting
$1.00 for microfilm (images 1 inch high on
standard 35 mm. motion picture film) or $1.00 for
photocopies (6 X 8 inches) readable without optical aid.
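The two worked figures in this note can be checked directly against the Spearman-Brown prophecy formula, r_kk = k r / (1 + (k - 1) r). The short sketch below is only an illustration of that standard formula; the function name `spearman_brown` is introduced here and is not part of the note or of the deposited table.

```python
def spearman_brown(r, k):
    """Predicted reliability when a measure of reliability r is lengthened k times."""
    return k * r / (1 + (k - 1) * r)


# A twenty minute test (r = 0.50) lengthened to one hour, i.e. k = 3.
print(round(spearman_brown(0.50, 3), 2))  # 0.75

# Ratings by one supervisor (r = 0.75) pooled over five raters, i.e. k = 5.
print(round(spearman_brown(0.75, 5), 2))  # 0.94
```

A table of the kind described would simply tabulate this function over a grid of r (rows) and k (columns).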


Received June 1, 1951. 


Editor’s Note: At the page proof stage the Editor
discovered to his mortification that a table and a
nomograph for the Spearman-Brown Formula were
published by Dunlap, J. W. and Kurtz, A. K., in
Handbook of statistical nomographs tables and formulas,
by the World Book Company in 1932. Had he been
aware of this, this article would not have been accepted.
- Editor.




The Scaling of Stimuli by the Method of Successive Intervals *


Allen L. Edwards
The University of Washington


We are sometimes faced in psychological 
research with the problem of ordering a set of 
stimuli or objects on a psychological continuum 
when the relative positions of the same stimuli 
on a physical continuum are unknown. 
Suppose, for example, that we have available a 
set of n stimuli. We assume that these stimuli 
possess varying but unknown degrees of some 
defined attribute. We wish to define opera- 
tionally a psychological scale for this attribute 
and to determine the values of the stimuli on 
the defined scale. 

Applying Thurstone’s (6, 7) well-known law 
of comparative judgment to data obtained by 
the method of paired comparisons provides 
one solution to the scaling problem. The 
method of paired comparisons, however, requires
n(n-1)/2 judgments for the n stimuli.
It is obvious that the method is experimentally
impractical when the number of stimuli to be
scaled is large. Twenty-five stimuli, for
example, would require 300 comparative judgments
from each subject.
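The growth in the number of judgments is easy to tabulate. The following minimal sketch evaluates n(n-1)/2 for a few set sizes; the function name is introduced here for illustration only.

```python
def paired_comparison_judgments(n):
    """Number of distinct pairs among n stimuli: n(n-1)/2."""
    return n * (n - 1) // 2


for n in (10, 25, 50):
    print(n, paired_comparison_judgments(n))
```

For n = 25 this gives the 300 comparative judgments mentioned above.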


Method of Successive Intervals 


In the present paper we shall describe an 
alternative method of scaling which possesses 
the following properties: (1) the method requires
but a single judgment from each subject
for each stimulus; (2) the method yields
scale values which are linearly related to those
obtained by the method of paired comparisons;
(3) the method provides its own internal consistency
check upon the validity of the various
assumptions made; and (4) the computations
involved are quite simple. The theoretical
development of this method of scaling, which
we shall call the method of successive intervals,
has been described elsewhere (2).

The basic data are obtained in the form of
judgments or ratings of each stimulus in terms
of successive intervals or categories representing
increasing amounts of the defined attribute.
No assumption, such as that involved in the
method of equal-appearing intervals (8), is
made concerning the widths of the successive
intervals. The only requirement is that each
successive interval represent an unknown but
additional amount of the attribute.

* This paper was prepared while the writer was a
post-doctoral Research Training Fellow of the Social
Science Research Council.

It is in the nature of the scaling problem to 
determine the widths of the intervals making 
up the psychological continuum. We make 
the assumption that the judgments for each 
stimulus are normally distributed on the 
unknown psychological continuum. The scale 
values of the stimuli are then defined as the 
means of the distributions of judgments as 
projected upon the psychological continuum. 

For purposes of illustration, we shall use
data reported by Saffir (5).¹ In Saffir’s study,
subjects judged the extent to which they would
like to associate with various nationalities.
Ten rating categories were used. We have
rearranged Saffir’s data so that the first
category represents nationalities which the
subjects would least prefer to associate with
and the last category represents nationalities
which the subjects would most prefer to
associate with. From the frequency distributions
of ratings, we obtain the cumulative
distributions of Table 1.

The matrix of Table 1 is of order $n \times r$ where
n is the number of stimuli and r is the number
of categories. Let the general element of this
matrix be $p_{jk}$. Any element $p_{jk}$ will then show
the proportion of subjects placing a given
stimulus j in the kth category or below. The
values $1 - p_{jk}$ will show the proportion of
subjects placing stimulus j above the kth
category. All subsequent calculations are
based upon the data of Table 1. They can
be described in terms of a series of matrices.

¹ Five of the nationalities rated in Saffir’s study are
not reported upon here. Three additional nationalities
could not be scaled by the technique described.
The reason for this will be discussed later.

The scale values of the stimuli are unknown.


Table 1 


Cumulative Distributions of Judgments for Nationality Preference Data * (N = 133)

Least Preferred                                                  Most Preferred


a 3 4 5 6 7 8 9 10 
Nationality 1 2 = 5 T 3 FA FS ry 1.00 
1. Austrian O, O y ott 32 62 95 1.00 
2. Belgian 0 00 -g oz o3 05 a8 85 84 100 
3. Frenchman 02 oe ‘00 01 02 .05 .08 29 63 1.00 
4. German 00 00 "9 46 65 .80 89 96 1.00 1.00 
5, Greek dE O ni o2 04 10 25 59 92 1.00 
6. Hollander 00 W a o G D a a 65 100 
7. Irishman 00 OL z as 35 50 70 89 98 1.00 
8. Italian :01 R 50 65 78 9 93 99 1.00 100 
9. Japanese 1 = r 56 77 85 92 98 1.00 1.00 
10. Mexican 0S a a 1 80 s5 88 94 98 1.00 1.00 
11. Negro 47 w o 0 04 05 20 56 93 1.00 
12) Norwegian OO > it 7o 3l 4 62 83o %2 98 1.00 
13. Pole %% o 0 W W G(s 62 100 
14, Scotchman 00 a 14 23 41 .60 76 87 97 1.00 
15. S. American .01 -08 E 09 20 .42 70 .92 .96 1.00 
oo s 3 
16. Spaniard 00 ‘OL 02 05 06 14 23 .50 .86 1.00 
17. Swede OL 


* After Saffir (5). 


Assuming, however, that the distributions of
judgments are normal on the psychological
continuum, the boundaries of each interval
can be expressed as normal deviates. If the
table of the normal probability curve is
entered with the value $1 - p_{jk}$, the corresponding
normal deviate will be the upper limit of
the kth category (or the lower limit of the
kth+1 category). The first stimulus, Austrian,
for example, provides estimates of the upper
limits of categories 4, 5, 6, 7, 8, and 9. Expressed
as normal deviates, these boundaries
are -1.64, -1.23, -.47, .08, .67, and 1.34,
respectively.²

Each stimulus will provide an estimate of one
or more boundaries. These estimates form
the $X_{jk}$ matrix which, of course, cannot be
of order larger than $n \times (r-1)$. A stimulus
whose frequencies are distributed over all r
categories, for example, may provide estimates
of r-1 boundaries.³ It is important to note
that the $X_{jk}$ values can be obtained without
any reference to the precise location of the
scale values of the stimuli.

Since the cell entries of the $X_{jk}$ matrix correspond
to upper limits of the kth intervals (or
the lower limits of the kth+1 intervals), the
differences $X_{j,k+1} - X_{jk}$ will provide estimates
of the widths of the successive intervals.
For the first stimulus, Austrian, these successive
differences are .41, .76, .55, .59, and .67.
These are estimates of the widths of intervals
5, 6, 7, 8, and 9, respectively. Obtaining the
similar differences for each of the other stimuli,
we have a matrix in which the entries of each
column are estimates of a common interval.
We assume that the best estimate of the
interval width is given by the mean of the
column entries.⁴ The obtained means are
.38, .40, .42, .41, .45, .52, .78, and 1.04. They
represent the widths of intervals 2, 3, 4, 5, 6,
7, 8, and 9, respectively. Cumulating the
means for the successive intervals, we have
the common psychological continuum for all
stimuli.

² In determining the boundaries of the categories,
values [...] may be ignored. Such values would be
determined by only a small number of observations
and are unreliable.

³ No estimate can be obtained of the upper boundary
of the rth category and no estimate can be obtained of
the lower boundary of the first category. If more than
50 per cent of the judgments for a given stimulus fall in
either of these categories, the stimulus cannot be scaled
by the method described. It is for this reason that the
three nationalities mentioned earlier were omitted.

⁴ The calculations up to this point are the same as
those described by Attneave (1) for his method of
graded dichotomies.


Table 2

Theoretical Cumulative Distributions Obtained from Scale Values and Interval Widths

                          Least Preferred                                    Most Preferred
                                        Cumulative Interval Widths
                           .00   .38   .78  1.20  1.61  2.06  2.58  3.36  4.40
Scale Values of
Nationalities                1     2     3     4     5     6     7     8     9    10
(2.51)  1. Austrian        .01   .02   .04   .09   .18   .33   .53   .80   .97  1.00
(3.04)  2. Belgian         .00   .00   .01   .03   .08   .16   .32   .62   .91  1.00
(3.48)  3. Frenchman       .00   .00   .00   .01   .03   .08   .18   .45   .82  1.00
(3.99)  4. German          .00   .00   .00   .00   .01   .03   .08   .26   .66  1.00
(1.29)  5. Greek           .10   .18   .31   .46   .62   .78   .90   .98  1.00  1.00
(3.16)  6. Hollander       .00   .00   .01   .02   .06   .14   .28   .58   .89  1.00
(3.89)  7. Irishman        .00   .00   .00   .00   .01   .03   .09   .30   .69  1.00
(2.07)  8. Italian         .02   .05   .10   .19   .32   .50   .70   .90   .99  1.00
( .77)  9. Japanese        .22   .35   .50   .67   .80   .90   .96  1.00  1.00  1.00
(1.06) 10. Mexican         .12   .25   .39   .56   .71   .84   .94   .99  1.00  1.00
( .07) 11. Negro           .47   .62   .76   .87   .94   .98   .99  1.00  1.00  1.00
(3.23) 12. Norwegian       .00   .00   .01   .02   .05   .12   .26   .55   .88  1.00
(1.69) 13. Pole            .05   .09   .18   .31   .47   .64   .81   .95  1.00  1.00
(4.07) 14. Scotchman       .00   .00   .00   .00   .01   .02   .07   .24   .63  1.00
(1.83) 15. S. American     .03   .07   .15   .26   .41   .59   .77   .94   .99  1.00
(2.21) 16. Spaniard        .01   .03   .08   .16   .27   .44   .64   .88   .99  1.00
(3.35) 17. Swede           .00   .00   .01   .02   .04   .10   .22   .50   .85  1.00
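The chain of steps from cumulative proportions to the psychological continuum can be traced numerically. The sketch below is illustrative only. The six proportions are an assumption: they were chosen here so that their normal deviates reproduce the boundaries quoted for the first stimulus, and they are not read from Table 1. The final boundary is taken as 1.34, the value implied by the stated difference of .67. The mean interval widths are those reported in the text.

```python
from itertools import accumulate
from statistics import NormalDist

# ASSUMPTION: illustrative cumulative proportions for categories 4 to 9 of one
# stimulus, chosen to reproduce the deviates quoted in the text.
proportions = [0.05, 0.11, 0.32, 0.53, 0.75, 0.91]

# Upper limit of each category expressed as a normal deviate.
boundaries = [round(NormalDist().inv_cdf(p), 2) for p in proportions]
print(boundaries)  # [-1.64, -1.23, -0.47, 0.08, 0.67, 1.34]

# Successive differences estimate the widths of intervals 5 to 9.
widths_one_stimulus = [round(b - a, 2) for a, b in zip(boundaries, boundaries[1:])]
print(widths_one_stimulus)  # [0.41, 0.76, 0.55, 0.59, 0.67]

# Mean widths of intervals 2 to 9 over all stimuli, as reported in the text.
mean_widths = [0.38, 0.40, 0.42, 0.41, 0.45, 0.52, 0.78, 1.04]

# Cumulating the means gives the common continuum, with the upper limit of
# the first interval taken as the origin.
continuum = [0.0] + [round(w, 2) for w in accumulate(mean_widths)]
print(continuum)  # [0.0, 0.38, 0.78, 1.2, 1.61, 2.06, 2.58, 3.36, 4.4]
```

The cumulated values agree with the cumulative interval widths printed at the head of Table 2.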

With knowledge of the psychological continuum,
it is a simple matter to find the scale
values of the stimuli. In terms of our earlier
discussion, they will be the medians of the
distributions of judgments as projected upon
the psychological continuum (under the assumption
of normality the medians coincide with the means).
They may be computed by formula, interpolating within
a specified interval to find the point below
which and above which 50 per cent of the
judgments fall.
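The interpolation can be sketched as follows. This is a minimal illustration that assumes linear interpolation within the interval containing the 50 per cent point; the function name is introduced here. For want of a legible empirical row, the example uses the theoretical Austrian distribution of Table 2 together with the cumulative interval widths, so the result only approximates the published scale value of 2.51.

```python
def scale_value(cum_props, upper_limits):
    """Median of a judgment distribution by linear interpolation.

    cum_props[k] is the proportion of judgments at or below category k + 1;
    upper_limits[k] is the upper limit of category k + 1 on the continuum.
    """
    for k in range(1, len(cum_props)):
        p_low, p_high = cum_props[k - 1], cum_props[k]
        if p_low < 0.5 <= p_high:
            x_low, x_high = upper_limits[k - 1], upper_limits[k]
            return x_low + (0.5 - p_low) / (p_high - p_low) * (x_high - x_low)
    # More than half of the judgments lie in an open-ended extreme category.
    raise ValueError("stimulus cannot be scaled by interpolation")


upper_limits = [0.00, 0.38, 0.78, 1.20, 1.61, 2.06, 2.58, 3.36, 4.40]
austrian = [0.01, 0.02, 0.04, 0.09, 0.18, 0.33, 0.53, 0.80, 0.97]
print(round(scale_value(austrian, upper_limits), 2))  # 2.5
```

The error raised in the last line of the function corresponds to the case in which a stimulus cannot be scaled because its median falls in the first or last category.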


Internal Consistency Check 

We have placed no restrictions upon the
distributions of Table 1, other than that the
entries in the last column must equal 1.00.
We thus have n(r-1) = 17(10-1) = 153 independent
entries in the table. We have
available the n = 17 scale values and the
r-2 = 8 interval widths, or a total of 25
parameters. If the assumptions we have
made are tenable, it should now be possible to
reproduce the 153 empirical values from the
25 parameters, within a specified margin of
error.


At the left of Table 2 we show the scale
values of the stimuli upon the common
psychological continuum. At the top of the
table we have reproduced the psychological
continuum. If we now subtract the scale
values of the stimuli from the cumulative
interval widths, we shall have a matrix of
theoretical normal deviates $X'_{jk}$. The $X'_{jk}$
values will be the boundaries of the successive
intervals as expressed in terms of normal
deviates from the scale values projected upon
the psychological continuum. The entries of
this matrix for the first stimulus, Austrian,
for example, would be -2.51, -2.13, -1.73,
-1.31, -.90, -.45, .07, .85, and 1.89. These
values would correspond to the upper limits
of the intervals 1, 2, 3, 4, 5, 6, 7, 8, and 9,
respectively, on the psychological continuum
for the first stimulus. From a table of the
normal probability curve, it is now possible to
determine the corresponding proportion of
judgments falling below each of the successive
intervals. These values are the cell entries
of Table 2.


Table 3

Distribution of Discrepancies Between Observed and
Theoretical Values of Table 1 and Table 2

Discrepancies     f
     .06          2
     .05          1
     .04          4
     .03          7
     .02         12
     .01         18
     .00         41
    -.01         18
    -.02         16
    -.03         11
    -.04          4
    -.05          5
    -.06          4
    -.07          8
    -.08          0
    -.09          1
    -.10          1
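The construction of one theoretical row can be sketched in a few lines. The example below subtracts the Austrian scale value (2.51) from the cumulative interval widths and converts the resulting deviates to proportions with the normal distribution function. It is an illustration under the stated normality assumption; the function name is introduced here, and a computed proportion may differ from the printed entry by one unit in the second decimal because of rounding.

```python
from statistics import NormalDist

# Upper limits of intervals 1 to 9 on the psychological continuum.
upper_limits = [0.00, 0.38, 0.78, 1.20, 1.61, 2.06, 2.58, 3.36, 4.40]


def theoretical_row(scale_value):
    """Return the theoretical deviates and cumulative proportions for one stimulus."""
    deviates = [round(limit - scale_value, 2) for limit in upper_limits]
    proportions = [NormalDist().cdf(z) for z in deviates]
    return deviates, proportions


deviates, proportions = theoretical_row(2.51)
print(deviates)  # [-2.51, -2.13, -1.73, -1.31, -0.9, -0.45, 0.07, 0.85, 1.89]
print([round(p, 2) for p in proportions])
```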

The entries in each row of Table 2 are 
theoretical cumulative distributions. If the 
assumptions we have made are tenable, they 
should reproduce the empirical distributions of 
Table 1. If we make the matrix subtraction 
of Table 2 from Table 1, we shall have the 
discrepancies between our empirical and theo- 
retical values. The distribution of these 
errors is shown in Table 3. It can readily be 
determined that the absolute mean discrepancy 
is .021. This means that from our 25 para- 
meters we can reproduce the empirical distribu- 
tions of judgments with an average error of 
only .021. 
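The figure of .021 can be verified from the frequencies of Table 3. The sketch below weights each absolute discrepancy by its frequency; the dictionary simply transcribes the table as read here.

```python
# Discrepancy -> frequency, transcribed from Table 3.
table3 = {
    0.06: 2, 0.05: 1, 0.04: 4, 0.03: 7, 0.02: 12, 0.01: 18, 0.00: 41,
    -0.01: 18, -0.02: 16, -0.03: 11, -0.04: 4, -0.05: 5, -0.06: 4,
    -0.07: 8, -0.08: 0, -0.09: 1, -0.10: 1,
}

n_entries = sum(table3.values())
mean_abs_discrepancy = sum(abs(d) * f for d, f in table3.items()) / n_entries

print(n_entries)                       # 153
print(round(mean_abs_discrepancy, 3))  # 0.021
```

The frequencies sum to 153, the number of independent entries in Table 1.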

The mean discrepancy of .021 compares
favorably with the values usually reported for
the internal consistency check applied to
paired comparison data. Guilford (3, p. 231),
for example, reports an average error of .027,
Hevner (4) an average error of .024, Thurstone
(9) an average error of .029, and Saffir (5) a
value of .031 for paired comparison data.

Fig. 1. Scale values obtained by the method of paired comparisons and by the method of successive intervals. (Axes: Paired Comparisons; Successive Intervals.)

In Figure 1 we have plotted the scale values 
obtained by the method of paired comparisons, 
as reported by Saffir, against those obtained 
here by the method of successive intervals. 
It is obvious that the relationship is linear and 
that the scatter is relatively small. 

We mentioned earlier that the distributions 
of judgments for five nationalities were not 
used in determining the psychological con- 
tinuum. These five nationalities were omitted 
for experimental reasons. We wanted to see 
if the scale values obtained by projecting the 
distributions of judgments for these five 
nationalities upon the psychological continuum 
would be consistent with the scale values of 
the other stimuli. The plotted points for 
these five nationalities are shown in Figure 1 
as small circles. It seems evident that their 
scale values are consistent with those obtained 
for the other 17 stimuli and with the corre- 
sponding values obtained by the method of 
paired comparisons. 


Summary 


The method of successive intervals can be
applied to any number of stimuli. Only n
judgments for n stimuli are required from each
subject in contrast with the n(n-1)/2 judgments
required in the method of paired comparisons.
Yet the scale values obtained by the
method of successive intervals are shown to be
linearly related to those obtained by the method
of paired comparisons. Furthermore, the method
of successive intervals, like the method of
paired comparisons, provides its own internal
consistency check. The average error in
reproducing the empirical data from a limited
number of parameters is shown to be comparable
to the values reported for the method of
paired comparisons.


Received May 31, 1951. 


References 


1. Attneave, F. A method of graded dichotomies for the scaling of judgments. Psychol. Rev., 1949, 56, 334-340.
2. Edwards, A. L. Psychological scaling by means of successive intervals. Psychometric Laboratory Report No. 69, May, 1951. Univ. Chicago.
3. Guilford, J. P. Psychometric methods. New York: McGraw-Hill, 1936, p. 231.
4. Hevner, Kate. An empirical study of three psychophysical methods. J. gen. Psychol., 1930, 4, 191-212.
5. Saffir, M. A. A comparative study of scales constructed by three psychophysical methods. Psychometrika, 1937, 2, 179-198.
6. Thurstone, L. L. Psychophysical analysis. Amer. J. Psychol., 1927, 38, 368-389.
7. Thurstone, L. L. A law of comparative judgment. Psychol. Rev., 1927, 34, 273-286.
8. Thurstone, L. L., and Chave, E. J. The measurement of attitude. Chicago: Univ. Chicago Press, 1929.
9. Thurstone, L. L. Unpublished study of food preferences.


Paired Comparison Ratings. I. The Effect on Ratings of Reductions in the Number of Pairs


Ernest J. McCormick
Occupational Research Center, Purdue University

and

John A. Bachus
The Kroger Company, Cincinnati, Ohio


There has been rather general agreement
that the paired comparison system is a
satisfactorily reliable method of obtaining
relative judgments in various situations,
including employee rating. Its limited use in
employee rating (as well as in other situations),
however, probably in large part is attributable
to the fact that it is time consuming and is
fatiguing to the judges if there are very many
individuals (or other stimuli) to be judged.¹

It was the purpose of this investigation to
determine the extent to which it would be
possible, in paired comparison ratings of
employees, to use reduced numbers of pairings
and still achieve essentially the same rating
results as would be obtained from a complete
pairing of all individuals within the group.


Experimental Procedure 


Employees Rated. Through the cooperation
of a manufacturing company two independent
groups of 50 employees each were rated by
their respective foremen. The individuals in
Group I, consisting entirely of women, worked
in the assembly department and were engaged
in the task of assembling the small parts of
electric meters. The individuals in Group
II, consisting of 48 women and 2 men, worked
in the machine department and were engaged
in forming and finishing small parts to be used
in electric meters.

Preparation of Pairs for Rating. A complete
pairing of each of the individuals in each
group with every other individual results in
1,225 pairs. An IBM card was punched for
each of the 50 individuals in each group; by
other special machine methods cards were
prepared for all 1,225 pairs for each group.²
Through mechanical methods the names of the
two individuals in each pair were printed on
the top edge of the card. Each person’s name
appeared on the right side and on the left
side of the cards respectively in about half of
the pairs.

Random numbers were also reproduced into
the cards, and the cards for each group were
then “sorted” by machine into random order
before presenting them to the foremen who
were to serve as raters.

¹ The number of pairs increases greatly with increasing
N. The total number of pairs, where each stimulus is
paired with every other one, is $N(N-1)/2$, where N
is the number of stimuli.
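The card preparation described above can be imitated in software. The sketch below is a modern stand-in for the punched-card procedure, not a description of it: it forms every pair for a group of 50, decides at random which name goes on the left (an assumption that only approximates the stated "about half" balance), and shuffles the deck. The function name and the seed are hypothetical.

```python
import random
from itertools import combinations


def build_deck(n_employees, seed=None):
    """Return all pairs of employee numbers in random order and random left/right position."""
    rng = random.Random(seed)
    deck = []
    for a, b in combinations(range(1, n_employees + 1), 2):
        # Randomise which employee of the pair appears on the left of the card.
        deck.append((a, b) if rng.random() < 0.5 else (b, a))
    rng.shuffle(deck)  # present the cards to the rater in random order
    return deck


deck = build_deck(50, seed=1952)
print(len(deck))  # 1225
```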

Rating of Employees. The cards were then
presented to the foremen with typewritten
rating instructions. These instructions provided
that the employees in each pair be
judged in terms of the following question:
“Which of these two employees is doing her
(his) present job better?” For each pair, the
foreman was asked to place a check mark beside
the name of the employee whom he judged
to be the better. The foreman of each group
rated only the members of his respective group.

Performance Rating Indexes. On the basis of
the judgments made by the raters, scale values
were determined for all employees in each
group. For this purpose performance rating
indexes provided with the Personnel Comparison
System³ were used. This performance
rating index is determined on the basis of the
total number of individuals paired, and of the
number of times an individual was chosen by
the rater over other individuals. More specifically,
a rating is based on the proportion of
times an individual is preferred, converted
to standard scores. The scale values tend
toward a normal distribution and provide for
a mean of 50 and standard deviation of 10.
The rating indexes actually range from a low of
23 to a high of 77.

² Appreciation is expressed to Dr. N. C. Kephart [...]
Purdue University for [...] developing the procedures for
the preparation and subsequent processing of the IBM cards.

³ The Personnel Comparison System, developed by [...],
is a [...] system based on the paired comparison [...].
The Personnel Comparison System is available from [...]
Company, 15 East 8th Street, Cincinnati, Ohio.

Fig. 1. Illustration of matrix of pattern of partial
pairing; “x” identifies paired individuals. (Rows and columns: employee number.)
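The index table itself is not reproduced in the article, so the exact conversion is unknown. As a rough illustration only, the sketch below assumes a plain linear standardization of the proportion of times preferred to a mean of 50 and a standard deviation of 10. The function name and the sample data are hypothetical, and the published table may well use a different conversion.

```python
from statistics import mean, pstdev


def rating_indexes(times_preferred, pairs_per_individual):
    """Convert counts of times preferred to standard scores (mean 50, SD 10).

    ASSUMPTION: linear standardization across the group; this is not the
    published Personnel Comparison System table.
    """
    proportions = [w / pairs_per_individual for w in times_preferred]
    centre, spread = mean(proportions), pstdev(proportions)
    return [50 + 10 * (p - centre) / spread for p in proportions]


# Hypothetical complete pairing of five employees (each in 4 pairs).
indexes = rating_indexes([0, 1, 2, 3, 4], pairs_per_individual=4)
print([round(x, 1) for x in indexes])
```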

Patterns for Pairing Individuals. Through
an empirical approach, various “patterns”
were developed for the “partial” pairing of
each individual (i.e., an employee is paired with
fewer than all the other employees in the
group). These patterns, used for the partial
pairing of the original groups of 50, provided
for the pairing of each individual in the group
with various numbers of other individuals, as
follows (the letter identifies the pattern, and
the number given is the number of pairs per
individual for the pattern): A-40; B-35; C-32;
D-28; E-25; F-21; G-17; H-13; I-9; and J-7.

Four patterns were also used in the pairing
of two groups of 30 individuals who had been
randomly extracted from the original groups of
50. Three of these patterns (A, E, and H)
were patterns which had been used with the
groups of 50, and which were also applicable
to groups of 30. The other pattern (K) was
specifically developed for use with the groups
of 30 individuals. These patterns provided for
the following numbers of pairs for each of the
30 individuals: A-24; E-15; K-12; H-8.

The Character of the Patterns. These patterns,
when worked into a triangular matrix,
indicate which individuals shall be paired for
rating. Such a matrix indicates the identification
numbers of the employees to be paired,
and is of course based on the assumption that
the identification numbers have been assigned
to the individuals in a random manner in so
far as skill on the present job is concerned.
Figure 1 shows, as an illustration, a completed
matrix (Pattern D) for a group of 15 employees
when each is paired with 8 other employees.
An “x” at the intersection of any column and
row indicates that the two employees represented
are to be paired when using this pattern.

Any given pattern is suitable for use with
certain N’s, but not for use with other N’s.
A pattern is suitable for use with a given N
if it results in all individuals being paired with
an equal number of other individuals. For any
particular pattern, then, the combination of
pairs resulting for a given N will determine
whether or not each of the N individuals is
paired with an equal number of other individuals;
this in turn will determine whether the
pattern is or is not suitable for the N in
question.

The N’s with which a particular pattern
can be thus used, however, increase in multiples
of a constant for that pattern; for an N which
coincides with any such multiple there will
result an equal number of pairs per individual.
For any given pattern, this increase in multiples
of a constant can be thought of as a
“rhythm” for the pattern. Starting out with
the smallest N for which a pattern results in
equal pairs per individual, it is possible to
determine empirically the next greatest N
for which there will also be equal pairs per
individual. The difference between these two
N’s is the size of the rhythm (in terms of
individuals) for the pattern. The extension
of the pattern to N’s that are increased by
multiples of this constant will result in an
equal number of pairs per individual for any
such N.

Fig. 2. Segments of first columns of patterns (underscoring
shows beginning and ending of “rhythm”).

In Figure 1, for example, the rhythm is
complete at each of the broken lines. As
presented, this pattern would be suitable for
an N of 8 (with each person paired with 4
others); an extension of the pattern to 15 also
results in equal pairing (8 pairs per individual).
The difference of seven (15-8) is the size of
the rhythm for the pattern. This pattern
would therefore be suitable for larger N’s
which increase in multiples of seven, such as
22, 29, 36, 43, 50, etc. This pattern, extended
to accommodate 50 individuals, was one of
those used in the investigation.

Figure 2 characterizes the several patterns
used in the investigation. The heading of the
figure shows, for each pattern, the size of the
rhythm (in numbers of individuals), and the
number of pairs resulting from the pattern
where it was used with the N’s of 50 or the N’s
of 30, respectively. The body of the figure
shows a segment of the first column for each
of the patterns; the column for a given pattern
identifies (by “x”) the individuals with whom
employee number one is paired for that pattern.
The underscoring shows the points at which the
rhythm of each pattern is complete; an extension
downward in the column of the
rhythm through the remaining individuals
would then give a complete first column for a
total of 50 or 30 employees depending on the
group or groups with which the pattern was
used. For any given pattern, then, knowing
the individuals with whom employee number
one is to be paired (column 1), it is only necessary
to complete the triangular matrix by filling
in the diagonals down toward the right, as
shown in Figure 1, to identify all of the pairs in
that pattern.
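The diagonal-filling rule lends itself to a short sketch. The published first columns are not legible here, so the pattern below is hypothetical: it marks employee 1 + d whenever d leaves a remainder of 0, 1, 3, or 5 on division by a rhythm of 7. It was chosen only because it yields 8 and 28 pairs per individual for N = 15 and N = 50, the figures reported for Pattern D, and it should not be taken as the actual Pattern D.

```python
RHYTHM = 7
HYPOTHETICAL_RESIDUES = {0, 1, 3, 5}  # assumption, not the published pattern


def first_column_offsets(n):
    """Offsets d such that employee 1 is paired with employee 1 + d."""
    return [d for d in range(1, n) if d % RHYTHM in HYPOTHETICAL_RESIDUES]


def fill_diagonals(n, offsets):
    """Extend each first-column mark down its diagonal: pairs (i, i + d)."""
    return [(i, i + d) for d in offsets for i in range(1, n - d + 1)]


def pairs_per_individual(n, pairs):
    """Count how many pairs each employee number 1..n appears in."""
    counts = [0] * (n + 1)
    for a, b in pairs:
        counts[a] += 1
        counts[b] += 1
    return counts[1:]


for n in (8, 15, 50, 12):
    pairs = fill_diagonals(n, first_column_offsets(n))
    counts = pairs_per_individual(n, pairs)
    print(n, len(pairs), "equal" if len(set(counts)) == 1 else "unequal")
```

With this hypothetical pattern the pairing is equal for N = 8, 15, and 50, which differ by multiples of the rhythm, and unequal for N = 12, which illustrates why a pattern suits only certain N’s.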

Method of Deriving Rating Indexes for
Various Patterns. It should be mentioned
that in using these patterns the foremen were
not required to re-rate the members of their 
groups. The cards containing the pairs re- 
quired for a given pattern were extracted from 
all the cards used in the initial complete pairing 
of each original group of 50. By this procedure 
it was then possible to make an independent 
tally, for each pattern, of the number of times 
each employee was preferred over the others 
with whom he was paired in the pattern in 
question. For each pattern, performance 
rating indexes were then obtained for all 
employees in the group in essentially the same 
manner used in obtaining rating indexes based 
on all possible pairs. One modification of the 
procedure was necessary, however, in using 
the performance rating index table to derive 
rating indexes resulting from the various 
patterns of partial pairings; for each pattern, 
instead of entering the table for an N of 50
(or 30 in the case of the smaller groups), the
“N” for a given pattern was considered to be
the number of pairs per individual, for that 
particular pattern, plus one. 


Results 


The rating indexes obtained with each 
pattern for the employees of Group I and of 
Group II were correlated with the rating 
indexes obtained from the complete pairing. 
Similar correlations were computed for the 
smaller groups, Group III and Group IV. 
The resulting correlations are given in Table 1. 

The differences in the two correlations for
each pattern were then subjected to tests of
statistical significance. Such tests were made
in order to ascertain whether differences in the
two correlations could or could not reasonably
be attributed to chance fluctuations. In
making such tests the correlation coefficients
for both groups were converted to Fisher’s z
coefficients. For each pattern the difference
between the z’s for the two groups was then
determined. This difference was in turn
divided by the standard error of the difference
between the two coefficients, using the formula
provided by Guilford (1, p. 224). The resulting
t ratios are presented in Table 1.


Table 1

Correlations, for Two Independent Samples, of Scale
Values Resulting from Various Patterns of
Partial Pairings with Those Resulting
from Complete Pairing

                                      Correlations for Two Groups
            Pairs Per     Total No.
Pattern     Individual    of Pairs     Group I     Group II     t Ratio

Groups of 50
   A            40          1,000        .991        .994          .97
   B            35            875        .992        .992          .00
   C            32            800        .993        .987         1.65
   D            28            700        .980        .984          .53
   E            25            625        .961        .971          .?3
   F            21            525        .960        .948          .68
   G            17            425        .962        .949          .73
   H            13            325        .935        .928          .24
   I             9            225        .936        .885         1.50
   J             7            175        .858        .888          .63

Groups of 30                           Group III   Group IV
   A            24            360        .996        .994          .51
   E            15            225        .991        .979         1.62
   K            12            180        .961        .946          .62
   H             8            120        .948        .898         1.29


It will be observed that none of the t ratios
even approaches the 5 per cent confidence
limits (1.96). Since none of the pairs of r’s
differ significantly, it may be inferred that the
magnitudes of the various r’s cannot reasonably
be attributed to chance fluctuations, and that
they therefore presumably reflect the approximate
degree to which ratings resulting from
the respective patterns of partial pairings
actually reproduce the ratings based on a
complete pairing.
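The significance test can be sketched with the usual large-sample formula for the difference between two independent correlations, in which each Fisher z has sampling variance 1/(N - 3). It is assumed here, without access to the cited page, that this is the formula referred to. The function names are introduced for illustration, and because the published correlations are rounded to three places the computed ratios need not match the table exactly.

```python
import math


def fisher_z(r):
    """Fisher's z transformation of a correlation coefficient."""
    return math.atanh(r)


def z_difference_ratio(r1, n1, r2, n2):
    """Difference between two Fisher z's divided by its standard error."""
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return abs(fisher_z(r1) - fisher_z(r2)) / standard_error


# Pattern G, Groups I and II (N = 50 each).
ratio = z_difference_ratio(0.962, 50, 0.949, 50)
print(round(ratio, 2))  # 0.73
```

A ratio below 1.96 indicates that the two correlations do not differ significantly at the 5 per cent level.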

Ratings from Complete Versus Partial Pair- 
ings for Groups of Fifty. It will be observed in 
Table 1 that the correlations for the two groups 
of 50 ranged from .991 and .994 for pattern A 
to .858 and .888 for pattern J. The decline in 
the correlations is rather consistent with 
reductions in the number of pairs per individ- 
ual, except that patterns I and J, which are 
based on 9 and 7 pairs per person, respectively, 
show more marked decline, and greater 
differences between the two groups, than do 
the other patterns. In general, it appears
that reductions in the number of pairs per
individual to 21 (pattern F) or to 17 (pattern
G) can be made with only limited
effect on the resulting ratings; these patterns
give correlations in the neighborhood of .95
and .96.

Ratings from Complete Versus Partial Pair- 
ings for Groups of Thirty. The correlations 
for the patterns used with the two groups of 
30 ranged from .996 and .994 (pattern A) to 
.948 and .898 (pattern H). Reductions to 
about 12 pairs per individual (pattern K) 
appear to be feasible without affecting materi- 
ally the resulting ratings. 

Ratings Resulting from Random Halves. As
a supplementary type of analysis, the two
groups of 50 were split into halves by selecting
at random the numbers of the employees to
go into each half. This gave two halves of
25 each for Group I and for Group II. The
individuals within each half were then paired
completely (i.e., each individual was paired
with each of the other 24 individuals in the
same half) and performance rating indexes
were obtained. For Group I and for Group
II independently, the performance rating indexes
obtained for all individuals (those from
both random halves) under these conditions
were then correlated with the indexes obtained
from the original complete pairing.
The correlations obtained were .974 and .955
for Groups I and II, respectively. These
correlations are of essentially the same order
as those obtained for pattern E in which
each individual is paired with 25 others.
This would seem to indicate that with an N
of approximately 50, a splitting of the group
into random halves and pairing each half
completely will give relatively the same ratings
as when the original total group is paired
completely. It also suggests that relatively
the same results can be obtained when pairings
are made within the random halves as when a
pattern of partial pairings is used which
provides for each individual to be paired with
approximately half of the others in the total
group.


Summary and Conclusions 


Two groups of 50 industrial employees were 
rated independently by their respective fore- 
men using the method of paired comparison; 
all possible pairs of employees were rated. 
performance rating index was obtained for 
each individual of each group using an index 


Paired Comparison Ratings 127 


table that is provided with the Personnel 


Comparison System. . 
A series of systematic patterns of partial 
pairings were developed for experimental use; 
each such pattern provided for each individual 
to be paired with a specific number of other 
individuals. Ten patterns were developed 
Which provided, respectively, for each person 
to be paired with the following numbers of 
others in the group: 40, 35, 32, 28, 25, 21, 17, 
13, 9, 7. The total numbers of pairs for the 
various patterns ranged from 1,000 to 175; a 
complete pairing results in 1,225 pairs. 

Performance rating indexes were computed 
from the ratings made on the pairs included 
in each pattern. These indexes were then cor- 
related with those derived from the complete 
pairing. The range of these correlations was
from .994 to .858. Correlations of the order 
of approximately .93, for example, were 
obtained with a pattern which reduced the 
total number of pairs from 1,225 to 325. 
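
The pair counts quoted above follow from simple counting, and a short sketch may make the arithmetic easier to check. It assumes only that a pattern in which each of n persons is paired with k others contains n*k/2 distinct pairs, since every pair is shared by two persons. The function names are illustrative and do not come from the Personnel Comparison System.

```python
def complete_pairs(n_people: int) -> int:
    """Number of pairs when every person is paired with every other person."""
    return n_people * (n_people - 1) // 2


def pairs_in_pattern(n_people: int, partners_each: int) -> int:
    """Number of distinct pairs when each person is paired with `partners_each` others.

    Each pair involves two persons, so the total is n_people * partners_each / 2.
    """
    total, remainder = divmod(n_people * partners_each, 2)
    if remainder:
        raise ValueError("n_people * partners_each must be even for a valid pattern")
    return total


if __name__ == "__main__":
    partners = [40, 35, 32, 28, 25, 21, 17, 13, 9, 7]  # the ten patterns in the text
    print("complete pairing:", complete_pairs(50))
    for k in partners:
        print(f"{k:2d} partners each -> {pairs_in_pattern(50, k)} pairs")
```

Under the same counting, two random halves of 25 paired completely give 2 * complete_pairs(25) pairs, somewhat fewer than a pattern pairing each of 50 persons with 25 others.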

Four patterns were also used with two groups
of 30 individuals extracted randomly from
the two original groups of 50. The ratings
resulting from these patterns were correlated
with the ratings resulting from a complete
pairing of each of the 30 individuals with all
of the others. These correlations ranged from
.996 to .808.

It should be kept in mind that the coeffi-
cients of correlation of the ratings based on
partial pairings will be affected, statistically,
by the fact that the partial pairings are
included in the complete set of pairings. There
is a certain parallel in this situation with that
in which part scores of a test are correlated with
total scores. These correlations, moreover,
should be interpreted as indexes of the extent
to which various patterns of partial pairings
can produce ratings which will reproduce the
ratings from a complete pairing. These cor-
relations therefore cannot be considered as
being specifically indicative of the reliability 
of the various patterns. The reliability of 
such a pattern would be largely a function of 
the extent to which different “samplings” of 
rating judgments based on that pattern would 
produce consistently the same rating results. 
The reliability of ratings based on partial 
pairings has been investigated in an associated 
study (2). 
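
The part-whole caution above can be illustrated with a small simulation. This is only a sketch under a deliberately simplified assumption that is not part of the study: each person's comparisons are treated as independent coin flips, so the "ratings" carry no reliable information at all. Even so, a score based on a subset of the judgments correlates substantially with the score based on all of them, simply because the subset is contained in the total. All names are illustrative.

```python
import random


def pearson(xs, ys):
    """Plain product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def part_whole_correlation(n_people, n_total, n_part, seed=0):
    """Correlate a part score (first n_part judgments) with the total score.

    Every judgment is a fair coin flip, so any correlation found here is due
    only to the part being included in the whole.
    """
    rng = random.Random(seed)
    parts, totals = [], []
    for _ in range(n_people):
        judgments = [rng.random() < 0.5 for _ in range(n_total)]
        parts.append(sum(judgments[:n_part]))
        totals.append(sum(judgments))
    return pearson(parts, totals)


if __name__ == "__main__":
    # 49 comparisons per person in a complete pairing of 50; 25 in the partial pattern.
    r = part_whole_correlation(n_people=2000, n_total=49, n_part=25)
    print(f"part-whole r with purely random judgments: {r:.3f}")
```

In this toy model the expected correlation is the square root of n_part / n_total, which is why such coefficients describe reproduction of the complete-pairing ratings rather than reliability.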

On the basis of the results of the experiment 
the following conclusions seem warranted 
when using the paired comparison system for 
rating employees in groups of approximately 
the sizes of those investigated: 


1. Ratings obtained from partial pairings 
result in fairly high correlations with ratings 
based on complete pairings; the correlations 
are reduced rather systematically with reduc-
tions in the numbers of pairs per individual
on which the ratings are based.

2. Rather substantial reductions can be
made in the numbers of pairs per individual
with only limited reductions in the extent to 
which the resulting ratings differ from those 
obtained with complete pairings. 

3. The potential reduction in the total
number of pairs to be rated with large groups 
can reasonably be expected to make the paired 
comparison system more practical for use in 
employee rating and for other purposes. 


Received May 28, 1951. 


References 


1. Guilford, J. P. Fundamental statistics in psychology 
and education. New York: McGraw-Hill Book 
Co., Inc., 1950. 

2. McCormick, E. J., and Roberts, W. K. Paired
comparison ratings. II. The reliability of ratings
based on partial pairings. J. appl. Psychol.,
1952, 36, in press.


Dial Reading Performance as a Function of Brightness ¹


S. D. S. Spragg and M. L. Rock ²


University of Rochester 


Instrument dials must often be read rapidly 
and accurately under conditions in which it 
is desirable to provide no more than the 
minimum amount of illumination necessary for 
the efficient performance of a task. Such 
conditions are found for example in the airplane 
cockpit during night flying. It has seemed 
desirable in the night operation of military 
aircraft and perhaps somewhat less for com- 
mercial aircraft to attain and preserve as much 
dark adaptation on the part of the pilot and
co-pilot as is feasible.

This demand has posed the persistent 
problem of the amount and nature of illumina-
tion which will best meet the requirements of
such a situation. A practical solution to the 
problem will obviously be a compromise, but 
it should be based on a determination of the 
effectiveness of visual performance under a 
range of intensities and spectral distributions 
of illumination. From this, one should be 
able to specify the amount and spectral distri- 
bution of illumination which will: (a) permit 
satisfactory performance of visual perceptual 
tasks inside the cockpit (reading dials, etc.); 
and (b) maintain a level of dark adaptation 
sufficient for the pilot and co-pilot to deal 
adequately with visual stimuli coming from 
outside the cockpit. 

As a beginning in a series of studies designed 
to contribute toward the solution of the problem 
experiments have been undertaken attempting 
to relate visual performance (as indicated by 
the speed and accuracy of reading dials) to the 
intensity and to the spectral distribution of the 
illumination provided. 

¹ The experiments reported here were conducted as
part of [...] research on human factors related [...]
carried out on a research contract (W33-038 ac18317)
between the University of Rochester and the Air
Materiel Command, [...] Air Forces. They have been
reported in the following technical reports to the
Aero Medical Laboratory, Air Materiel Command:
MCREXD-694-21 and [...].

² M. L. R. is now associated with E. N. Hay
Associates, Philadelphia.

The present report concerns itself with dial
reading performance as a function of illumination
intensity. Subsequent reports will describe
comparable experiments using a range
of colored filters to modify the spectral
distribution of illumination as well as studies
of the adequacy of flying a Link Trainer (in
a task in which the cues are almost completely
visual) as a function of the above variables.

Although dial reading is a complex percep- 
tual task rather than a simple acuity function, 
available information on the relationship 
between acuity and illumination is relevant in
that it may suggest the general nature of the
function as well as set a lower limit to perform-
ance. The early studies of König as reported
in (15, p. 86) as well as other more recent 
studies have indicated that acuity varies as 
the logarithm of illumination intensity, with 
the implication that even at high illuminations 
an increase in illumination will produce some 
increment of acuity. 

Other workers, however, have reported that 
visual acuity increases with illumination incre- 
ments only up to a relatively modest level 
(such as 5 to 10 or 20 foot-candles) and that 
the increase is hardly noticeable beyond this 
range. Carmichael and Dearborn (2) after 
reviewing the relevant acuity and reading 
studies chose an illumination intensity of 16 
foot-candles for their reading experiments, 
considering this value to represent an optimum
level in view of the available evidence. 

A number of recent studies, both military 
and civilian, have concerned themselves with 
factors determining acuity and other character-
istics of visual performance as a function of
illumination level in a variety of task situa- 
tions. This literature has been surveyed, with 
differing emphases, by Fulton and his co-
workers (5, 6, 7), Lawrence and Macmillan
(10), Smith and Kappauf (12) as well as others.
A resumé of that literature will not be under-
taken here. There still remains, however,
need for information relating visual perceptual
tasks (such as dial reading) to a systematically
varied range of illumination values. Such is 
the aim of the present study. 




Method 


Two experiments were performed (I and II). 
Except where otherwise indicated the state- 
ments in this section apply to both experiments. 

Apparatus. The general plan of the appara- 
tus followed that employed by Kappauf and 
his co-workers (8, 9) in their studies of dial 
designs and legibility. The subject was seated
in a three-sided booth, approximately 4 X 4
feet, facing the middle wall. In this wall was
an 11 X 14 inch aperture in which the sample
dial and the cards containing banks of stimulus
dials were presented. The center of the
aperture (and of a bank of dials) was 28 inches
from the subject’s eyes and 15° below his
horizontal line of regard. The carrier for the
dial cards was correspondingly tilted 15° so
that the surface of the card was normal to the
subject’s line of regard when directed at the
center of the bank of dials. An adjustable
head-rest, mounted on a horizontal bar,
served to keep the subject’s head in a satisfac-
torily constant and comfortable position.

The carrier for the dial cards slid in hori-
zontally placed brass tracks. It was double
(i.e., 11 X 28 inches) so that as one card (e.g.,
a sample dial) was slid out of the subject’s view,
another card (a bank of dials) came immedi-
ately into view. Micro-switches at each end
of the track were arranged so that the illumi-
nation on the dial cards went off as the carrier
was moved from one position and came on as
it reached the other position. In this way the
shift from one card to another was accom-
plished rapidly in a short interval of darkness
and did not require the subject to make any
major shift in visual orientation. Thus the
subject was kept steadily at the chosen level of
illumination throughout a series of readings,
except for an instant of darkness between the
presentation of each stimulus card.

The experimenter was seated at a
work table placed against the outside of the
middle wall of the booth. The card carrier
was in front of him within easy reach and to
his side was a bin containing the supply of
stimulus cards.

Pairs of Mazda lamps served as light sources.
They were mounted on the horizontal bar
that carried the subject’s head-rest, about [...]
inches on each side of the head-rest. For the
four lowest intensities two Air Force cockpit
lamps, type C-4, were used; for the 6 foot- 
lambert intensity 115v. 25w. Mazda lamps in 
cans were employed. The lamps were care- 
fully adjusted so that the stimulus cards were 
evenly illuminated. 

Voltage was maintained at a constant level 
by means of a Variac, Model V-5MT, and 
a monitoring Weston AC voltmeter, Model 
433, on the experimenter’s desk. The lamps 
were operated at less than rated voltage, 
41v. in the case of the cockpit lamps (wired in 
series) and at 93v. for the two 25w. Mazda 
lamps, in order to increase their stability. 
The color temperature of both was in the 
neighborhood of 2400° K. 

Chosen levels of illumination were achieved 
by means of accurately drilled apertures in 
removable brass plates. All light sources had 
two ground-glass surfaces in the optical 
pathway to achieve high dispersion. 

Materials. The stimulus materials consisted 
of high-contrast, white on black photographic 
reproductions of dial setting. Each stimulus 
card showed 12 dials—three rows of four dials 
each. The dials used in Experiment I were 
2.8 inches in diameter, constructed according
to Air Force specifications, but with a scale of 
100 units divided by numbers and scale marks 
at every 10 units. Figure 1 shows a represen- 
tative bank of dials. Sample dials were 
identical to the stimulus dials except that they 
lacked a pointer. The dials for Experiment 
II were the same as those for Experiment I 
except for two changes: they were 1.4 inches 
in diameter, and had scale marks for every unit 
on the 100 unit scale. These dials, chosen 
from a wide variety developed by Dr. William 
Kappauf and his associates at Princeton 
University, were selected because they had 
been shown to constitute a relatively difficult 
perceptual task with a fairly high proportion 
of errors. Details of the construction of the 
dial cards and some experimental results have 
been reported (9). 

³ This project is grateful to Dr. Kappauf for his
generosity in making available these dials
and for his many valuable suggestions.

Fig. 1. A representative bank of dials, 2.8 inch diameter, 100 X 10 scale.

For Experiment I five of the stimulus cards
(each containing 12 dials) were selected. Each
was cut vertically into equal halves thus
yielding ten cards, each containing three rows
of two dials each. These were mounted on
masonite board. For a given reading any
two of the cards were selected and placed
together to form the left and right halves of
a full 12-dial stimulus card. This procedure
made available a large number of combinations
of half-cards, and reduced the chances for
distortion of results due to remembering
certain recurring combinations or sequences.
A given stimulus combination (12 dials) thus
appeared only once during the course of the
experiment, even though each half-card (6
dials) appeared 6 times, 3 on the right and 3
on the left, during the course of the training
and formal trials. A counter-balanced se-
quence was also employed so that the appear-
ance of the half-cards was distributed through-
out the course of the readings.

Data sheets were prepared in advance for 
each stimulus combination and for each ex-
perimental sequence used. These indicated
the correct settings, with adjoining spaces for 
recording the subject’s responses and provision 
for recording time, total errors, and other 
relevant information. 

Subjects. Twenty male subjects served in 
Experiment I. All were students at the
University of Rochester (5 graduates, 15
undergraduates) and were in their late teens
or twenties in age. Subjects chosen were those
who passed a rigorous visual screening,⁴ using
the Keystone Telebinocular. All subjects
had: normal ophthalmoscopy; 20/20 visual
acuity, monocularly and binocularly, at dist-
ance and near, without glasses; 80% or better
stereopsis; no vertical imbalance; less than 6
prism diopters physiological exophoria; less
than 6 prism diopters exophoria at distance;
and normal color vision. Subjects were paid
for their services.


Procedures. Each subject in Experiment I
was allowed to become cone dark adapted
(approximately 10 minutes) before the illumi-
nation was turned on. He was then shown in
the aperture a sample dial under the illumina-
tion to be used first. The instructions called
his attention to the dial and its scale and con-
tinued as follows: “When I say ‘Ready’ the
lights will go off and in a moment they will
come on again. When they come on, you are
to read the settings on the dials, reading from
left to right, first the top row, then the second,
then the third. Read the dials to the nearest
unit, such as 61, 38, 43, etc. Read the dials as
rapidly and as accurately as you can.”

On each trial a “ready” signal was given, the
lights went off briefly as the card to be read was
slid into position, and then came on showing
the card of 12 dials in position.

⁴ By one of us (M. L. R.), a graduate optometrist.

Six cards (72 dials) were read before formal 
trials were begun; this fore-test served to reduce 
practice effects during the experiment. 

On the formal trials each subject read 10 
cards of dials at each of 5 brightness levels. 
Subject’s responses were recorded as described 
earlier. Time was recorded by the experi- 
menter's starting a Standard Electric Timer as 
the subject read the first dial and stopping it 
as he read the eleventh dial. The first and
last dial readings in each card were eliminated
from both the time and error data because of
their relative unreliability. Evidence in sup-
port of this procedure has been reported else-
where (9, pp. 37-38). Thus the data for each
subject consist of 100 dials read at each of five
brightness levels. 

The five brightness levels used in Experiment I
were chosen as a result of exploratory
experimentation which indicated that a rather
sharp change in the difficulty of the dial-
reading task occurs at a brightness of about 0.02
foot-lamberts.⁵ For this experiment, therefore,
two values were chosen which would bracket
the suggested transition level, a third value was
chosen to be slightly above cone threshold for
the dark-adapted eye, a fourth at 6 foot-lamberts
(the level which Kappauf, Smith, and Bray (9)
used), and a fifth at an intermediate level.
The values selected were: 0.005, 0.018,
0.022, 0.296, and 6.0 foot-lamberts.

Brightness measurements were made with a
Macbeth illuminometer used in the subject’s
position, and directed against an 11 X 14 inch
sheet of unexposed but fixed photographic
paper from the same stock as that of the dial
reproductions. Thus its “whiteness” (reflect-
ance) was equivalent to that of the white
markings (numbers, pointer, scale markings)
of the dials used. The contrast between white
and black areas on the dials was somewhat
greater than 10 to 1.

A considerable number of brightness readings
at each level was made by each of the two
writers and the accepted value in each case was
taken as the average of the two observers’
median readings. Agreement was close, being
within 5-10% for all levels.

Since five levels of illumination were used,
varied sequences of brightness levels were em-
ployed to balance practice and fatigue effects.
The only restriction imposed was that a series
of readings at the brightest level should never
be immediately followed by a series at the
dimmest level. As a further precaution, in
changing from one brightness level to another
the subject was given from 5-10 minutes for
adaptation, with the light at the new level
illuminating the sample dial.

⁵ The foot-lambert is a measure of the density of
light flux reflected from a surface. The foot-candle,
perhaps more familiar, is a measure of the density
of light flux (illuminance) falling upon a surface.
The relationship between B (in foot-lamberts) and
E (in foot-candles) is: B = RE, where R is a value
for the reflectance of the surface in question
(e.g., 40%, 75%, etc.).
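
The footnote's relation between brightness and illumination can be written as a short conversion sketch. It assumes the usual idealization of a perfectly diffusing surface, for which brightness in foot-lamberts equals illumination in foot-candles multiplied by reflectance. The reflectance value used in the example is the footnote's illustrative 75%, not a measured value for the photographic paper, and the function names are hypothetical.

```python
def foot_lamberts(foot_candles: float, reflectance: float) -> float:
    """Brightness B (foot-lamberts) of a diffusing surface: B = R * E."""
    if not 0.0 <= reflectance <= 1.0:
        raise ValueError("reflectance must be a fraction between 0 and 1")
    return reflectance * foot_candles


def foot_candles_needed(target_foot_lamberts: float, reflectance: float) -> float:
    """Illumination E (foot-candles) needed to reach a target brightness."""
    if not 0.0 < reflectance <= 1.0:
        raise ValueError("reflectance must be a fraction greater than 0 and at most 1")
    return target_foot_lamberts / reflectance


if __name__ == "__main__":
    r = 0.75  # illustrative reflectance only (assumption)
    for b in (0.005, 0.018, 0.022, 0.296, 6.0):  # brightness levels of Experiment I
        print(f"{b:>6} ft-L on a {r:.0%} surface needs {foot_candles_needed(b, r):.4f} ft-c")
```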

Each subject was tested at two sessions, 
several days apart. At the first session sub- 
jects were given the visual screening tests, the 
practice trials, and tests at the first two bright- 
ness levels to be used for that subject. At the 
second session some further informal practice 
was given, then tests on the remaining three 
brightness levels. 

Subjects were given no knowledge of results, 
i.e., they were not told the correct readings nor 
whether their readings were correct or wrong. 


Results 


Experiment I. The data of this experiment
consist of error scores and time scores made 
by the group of 20 subjects, under the 5 levels 
of illumination employed. 

The principal analysis of errors is in terms of 
error frequency, i.e., the number of readings in 
error without regard to the magnitude of error. 
Thus an error of one unit has equal status with 
an error of four or ten units in such an analysis. 
Table 1 summarizes the mean error frequencies 
for the five brightness levels used. Each mean 
is based on 100 dials read by each of 20 
subjects. 

Data on speed of dial reading consist of times
required to read the middle 10 of each card of
12 dials. They are summarized as mean
reading time per dial in Table 1. Each mean
is based on 2000 readings (100 dials read by
each of 20 subjects).⁶


Table 1

Dial Reading Performance at Five Brightness Levels,
2.8 inch, 100 X 10 Dials (N = 20)

              Mean Number
              (and %) of                   Mean Reading
Brightness,   Readings in                  Time per
in Foot-      Error in Reading  Standard   Dial, in      Standard
Lamberts      100 Dials         Deviation  Seconds       Deviation

0.005         67.3              10.0       2.84          .93
0.018         59.9              14.1       2.64          .74
0.022         30.1               8.1       1.52          .21
0.296         27.8               5.5       1.33          [?]
6.0           27.8               4.4       1.30          .22


Fig. 2. Frequency of errors in reading 2.8 inch,
100 X 10 dials and 1.4 inch, 100 X 1 dials as a function
of brightness. (Ordinate: percent readings in error;
abscissa: log I, in foot-lamberts; separate curves for
the 2.8 inch and 1.4 inch dials.)

The error frequency data are summarized in 
Figure 2 (results for 2.8 inch dials) and the 
reading time data in Figure 3 (2.8 inch dials). 
These two figures are seen to be highly similar.⁷
Both indicate that in this visual task there is 
marked improvement with illumination in- 
crease up to approximately 0.02 foot-lamberts 
and relatively little improvement thereafter 
at least up to 6.0 foot-lamberts. We have 
made informal observations indicating no 
significant improvement at levels considerably 
higher than this. 

Since our principal concern was with dial
reading performance as a function of brightness
level, statistical analysis consisted primarily of
t tests comparing performance between the
several brightness levels, both for errors and
for time. These are summarized in Table 2.
From this table it is seen that all the differences
and only the differences which cross the 0.02
foot-lambert value are significant at the 1%
level. Except for one instance no difference
that does not cross the 0.02 foot-lambert value
is significant at even the 5% level. It seems
clear that in terms of speed as well as accuracy
there is a highly significant improvement in
dial reading performance when the brightness
level is increased from values below 0.02 foot-
lamberts to measured values above 0.02 foot-
lamberts, and that further increases up to
6.0 foot-lamberts bring little or no increment in
performance.


Table 2

Values of t, Comparing Dial Reading Performance at
Five Brightness Levels, 2.8 inch, 100 X 10 Dials

               Brightness, in Foot-Lamberts
          0.005     0.018     0.022     0.296     6.0

t's between Error Means
0.005     -
0.018     2.70*     -
0.022     12.89**   14.39**   -
0.296     16.61**   13.62**   1.75      -
6.0       15.70**   8.84**    1.58      .00       -

t's between Time Means
0.005     -
0.018     1.75      -
0.022     6.82**    6.60**    -
0.296     7.60**    8.05**    1.90      -
6.0       9.40**    9.86**    1.94      1.68      -

* Significant at 5% level.
** Significant at 1% level.


⁶ Detailed results for errors and times have been pre-
sented in the original technical reports of these experi-
ments (13, 14).

⁷ In Figures 2 and 3 the data have been plotted
against the logarithm of brightness since an arithmetic
plot would involve a very lengthy scale and the slopes
of all four of the curves would appear very nearly
vertical. The logarithm values for the brightnesses
used (other than the obvious ones) are shown as follows
in the parentheses: 6.0 (0.778); 0.296 (1̄.471); 0.022
(2̄.342); 0.018 (2̄.255); 0.005 (3̄.699).
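
The logarithm values quoted in the footnote on plotting follow the older table convention of a negative characteristic with a positive mantissa, so that 1.471 for 0.296 stands for -1 + 0.471 (that is, -0.529). The following sketch, with an illustrative function name, reproduces those values from the brightness levels. Only the standard library `math.log10` and `math.floor` are used.

```python
import math


def characteristic_and_mantissa(value: float):
    """Split log10(value) into an integer characteristic and a mantissa in [0, 1)."""
    log = math.log10(value)
    characteristic = math.floor(log)   # e.g. -1 for 0.296
    mantissa = log - characteristic    # always non-negative
    return characteristic, round(mantissa, 3)


if __name__ == "__main__":
    for brightness in (6.0, 0.296, 0.022, 0.018, 0.005):
        c, m = characteristic_and_mantissa(brightness)
        print(f"{brightness:>6} ft-L: characteristic {c:+d}, mantissa {m:.3f}, "
              f"log10 = {math.log10(brightness):.3f}")
```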

Fig. 3. Mean time required to read 2.8 inch,
100 X 10 dials and 1.4 inch, 100 X 1 dials as a function
of brightness. (Ordinate: mean time per dial, in seconds;
abscissa: log I, in foot-lamberts; separate curves for
the 1.4 inch and 2.8 inch dials.)

The distribution of errors with respect to
magnitude of error is summarized in Table 3.
Although at all brightness levels errors of 1
scale unit are in the majority, it is clear that
the distribution of errors is markedly different
above and below 0.02 foot-lamberts. Above
this value errors of 1 and 2 scale units account
for 95% to 96% of all errors made. For the
two brightness levels below 0.02 foot-lamberts,
however, errors of 1 and 2 scale units account
for only about 75% of the errors and errors
of greater magnitude are much more frequent.

The large magnitude errors (10 scale units
and over) at the two lower brightness levels
were mostly errors of 48 to 50 scale units.
At these levels subjects were at times uncertain
as to which was the pointer end and which the
reverse end of the indicator. At the two
lowest brightness levels about 1 reading in 16
was a reversal (error of approximately 180°).
At levels above 0.02 foot-lamberts this type of
error was completely absent in 6000 dial
readings.
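
The scoring of error magnitude and the identification of reversals can be sketched as follows. The sketch assumes that the 100-unit scale occupies the full circle, which is what the equivalence of a 180° reversal with an error of about 50 scale units implies, and that errors are measured the short way around the dial. The tolerance of 2 units reflects the 48 to 50 unit range mentioned above. Function names and bin labels are illustrative.

```python
SCALE_UNITS = 100  # units around the full dial (assumption: scale covers 360 degrees)


def error_magnitude(reading: int, setting: int, scale: int = SCALE_UNITS) -> int:
    """Size of a reading error in scale units, measured the short way around the dial."""
    diff = abs(reading - setting) % scale
    return min(diff, scale - diff)


def is_reversal(reading: int, setting: int, tolerance: int = 2) -> bool:
    """True when the error is within `tolerance` units of a half turn (about 180 degrees)."""
    return error_magnitude(reading, setting) >= SCALE_UNITS // 2 - tolerance


def magnitude_bin(reading: int, setting: int) -> str:
    """Assign a reading to the error-magnitude classes used in Table 3."""
    size = error_magnitude(reading, setting)
    if size == 0:
        return "correct"
    if size == 1:
        return "1"
    if size == 2:
        return "2"
    if size <= 9:
        return "3 to 9"
    return "10 and over"


if __name__ == "__main__":
    for reading, setting in [(61, 61), (38, 37), (43, 47), (11, 61), (99, 1)]:
        print(reading, setting, magnitude_bin(reading, setting), is_reversal(reading, setting))
```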

Analysis of possible practice effects in this
experiment was made by pooling the data for
first brightness level tested, second level tested,
etc. Since each brightness level appeared in
each ordinal position an equal number of times,
no advantage due to sequence is present for
any brightness level. The results, both for
time and for error scores, show no evidence of
a practice effect for the experiment as a whole.
In fact there was some decrement in perform-
ance on each of the two days as testing
continued. This would suggest that motiva-
tional and fatigue factors may have been more
important here than practice effects. Subjects’
comments indicated that the degree of concen-
tration required made this an arduous task.


Table 3

The Proportion of Errors Occurring at Differing
Magnitudes, for Each Brightness Level

Magnitude of               Brightness, in Foot-Lamberts
Error, in
Scale Units            0.005    0.018    0.022    0.296    6.0

1                      54%      58%      85%      90%      [?]
2                      [?]      19       10       [?]      [?]
3 to 9                 19       15       [?]      [?]      [?]
10 and over            9        8        2        3        3

Total percent          100      100      100      100      100
Total number of
readings in error      1346     1198     602      556      556


Table 4

Dial Reading Performance at Five Brightness Levels,
1.4 inch, 100 X 1 Dials (N = 10)

              Mean Number of               Mean Reading
Brightness,   Readings in                  Time per
in Foot-      Error in Reading  Standard   Dial, in      Standard
Lamberts      50 Dials          Deviation  Seconds       Deviation

0.005         31.3              8.0        3.45          1.33
0.01          20.8              7.8        2.79          0.66
0.05           5.7              4.0        1.77          0.21
0.1            3.9              2.8        1.71          0.24
1.0            3.2              2.1        1.55          0.21


Experiment II. The findings reported above 
were based on fairly large dials with widely- 
spaced scale divisions. In order to test the 
generality of these findings a second experiment 
was run using smaller dials (1.4 inch diameter) 
and finer scale division spacings (a scale mark 
for each unit of the 100 unit scale). With 
these exceptions the stimulus materials, ap- 
paratus, and general procedures were the same 
as for Experiment I. Five brightness levels 
were chosen: 0.005, 0.01, 0.05, 0.1, and 1.0 
foot-lamberts sampling in a somewhat different 
manner the approximate brightness range 
employed in Experiment I. 

Subjects were ten male students selected in 
the same manner as for Experiment I. Each 
subject was given five cards (of 12 dials each) 
to read as preliminary practice. On the 
formal trials each subject read five cards at 
each of the five brightness levels tested.
Since the data are based on the middle 10 dials 
of each card the results consist of 50 dial 
readings at each brightness level for each of 
ten subjects. 

Table 4 summarizes the mean error fre- 
quencies and the mean reading time per dial 
for the five brightness levels. The error 
frequency data are also presented graphically 
in Figure 2 (results for 1.4 inch dials) and the 
mean reading time data in Figure 3 (1.4 inch 
dials). 

A t test analysis comparing performance
between the various brightness levels, both for
time and for error frequency means, is sum-
marized in Table 5.


Table 5

Values of t, Comparing Dial Reading Performance at
Five Brightness Levels, 1.4 inch, 100 X 1 Dials

               Brightness, in Foot-Lamberts
          0.005      0.01      0.05      0.1       1.0

t's between Error Means
0.005     -
0.01      3.37**     -
0.05      11.48**    6.43**    -
0.1       10.66**    7.82**    1.30      -
1.0       [?].53**   5.77**    1.43      0.57      -

t's between Time Means
0.005     -
0.01      1.67       -
0.05      3.60**     5.85**    -
0.1       3.67**     6.25**    2.46*     -
1.0       4.15**     7.10**    9.08**    4.28**    -

* Significant at the 5% level.
** Significant at the 1% level.

The results presented indicate that as bright- 
ness level is increased dial reading accuracy and 
speed improve markedly up to 0.05 foot- 
lamberts, but above this level increments of 
improvement are much less. Analysis of the 
data shows that for errors there is no signif- 
icant improvement above 0.05 foot-lamberts. 
For time, however, there is significant improve- 
ment throughout the brightness range tested 
even though the absolute change is much less 
above 0.05 foot-lamberts. 

These results and inspection of the two sets 
of curves in Figures 2 and 3 indicate that the 
findings of both experiments are in essential 
agreement. Both the time curves and the 
error curves indicate that dial reading perform- 
ance becomes markedly poorer as brightness 
falls below the level of approximately 0.02 
foot-lamberts. At 0.005 foot-lamberts (which 
is slightly above cone threshold) it takes 
roughly three to three and one-half seconds to 
read a dial, and approximately two-thirds of 
the readings are in error. 

The fact that the performance curves suggest 
something approaching a plateau or, at least, 
a relatively gentle slope between the two 
lowest brightness levels for the 2.8 inch dials, 
whereas the curves for the 1.4 inch dials rise 
rapidly and steadily below 0.05 foot-lamberts,
is believed not to be a serious discrepancy.
The hint of a plateau or moderation of slope in 
the 2.8 inch dial data may have been affected 
by the choice of brightness levels, by some 
aspect of the 2.8 inch dials (such as absence 
of fine scale marks), or may have been a result 
of some aspect of the sampling. On the basis 
of Rock’s findings with a series of four widely 
differing visual tasks (11), and in view of the 
common-sense consideration that if brightness 
is pushed down to cone threshold and below the 
time and error scores in dial reading will 
certainly rise to very high levels, it would seem 
reasonable to hypothesize that the performance 
curves for the 1.4 inch dials more closely 
represent the situation in the region from cone 
threshold up to 0.05 foot-lamberts. 

For both types of dial evidence for a plateau 
or near plateau at values above the 0.02 
foot-lambert region is considerable. In both 
experiments performance increments become 
slight at brightnesses above this region. For
the higher values the time required to read 
each dial is of the order of 1.3 to 1.5 seconds. 

Some discrepancy is seen between the error 
scores for the two sizes of dial at the three 
highest brightness levels. For the 1.4 inch, 
finely spaced dials the proportion of the read- 
ings in error is 10 per cent or less. For the 
2.8 inch, coarsely graduated dials the propor- 
tion is somewhat greater than 25 per cent. 
It is believed that this difference can be 
explained by the differences in scale markings
between the two dials. Given sufficient
brightness the 1.4 inch, 100 X 1 dials can be
read with considerable accuracy because no
interpolation is required. The 2.8 inch dials,
however, continue to require interpolation
judgments at these levels of brightness as well
as at the lower levels. At the lowest bright-
ness level the small dials also probably require
a good deal of interpolation judgment because
the minor scale marks have become difficult or
impossible to see. Hence the results for the
two kinds of dial agree closely at these levels.


Discussion and Conclusions 


The results of the time scores and error
scores reported above indicate clearly that
there is a critical level of brightness (about
0.02 foot-lamberts) below which subjects find
it difficult to perform the dial reading task, as
shown by relatively slow responses and greater
frequency and magnitude of errors. Above
this level the task becomes suddenly much
easier, responses are quicker and frequency
and magnitude of errors much less. Further
increases in brightness, however (at least up
to 6.0 foot-lamberts and very probably in-
definitely), produce no further increments of
performance. It seems as though once a
subject has been given enough brightness to
perform this task with ease, brightness is no
longer a significant variable.

These findings have recently been corrob-
orated by two studies from other laboratories,
reported after the present experiments were
completed, and from a later series of experi-
ments by Rock (11) in this laboratory. From
the Tufts laboratory Crook and his co-workers
( ) have reported results on a task involving
the reading of numerals with brightnesses
ranging from 15 to 0.01 foot-lamberts. Their
findings for 10 point type size are in very close
agreement, both for time and for error curves,
with the comparable experiment reported by
Rock (11) and are in general agreement with
our dial studies. Their results show that the
region of 0.02 to 0.04 foot-lamberts is critical
for their visual perceptual task, performance
dropping off sharply below this level but
showing little or no increment above this
level.

At the Princeton laboratory Kappauf and
his colleagues (3) have carried out dial reading
studies on a variety of dial types, using brightnesses
ranging from 2.7 to 0.0009 foot-lamberts.
Although there is much variation in the values
of percentage of readings in error in their
experiment the general shape of their curves
for 1.4 inch diameter dials (both for errors and
time) agrees very closely with the results
for 1.4 inch dials reported here (Experiment
II) in that they indicate a critical brightness
level slightly above 0.01 foot-lamberts.
Their over-all results for 2.8 inch dials
would suggest a critical brightness level at
about 0.007 foot-lamberts, a value somewhat
lower than that found in our Experiment
I. If, however, we plot their results for
the 2.8 inch 100 X 10 dials (duplicates of which
were used in our Experiment I) we find that,
allowing for some irregularities in their obtained
percentages of readings in error, a
smoothing of their 100 X 10 results would 
locate the critical brightness level at about 0.01 
foot-lamberts. This should be regarded as 
good agreement with our present results. 

These several findings are in interesting 
contrast with König’s classical curve relating 
acuity to brightness, and to the findings of 
Hecht and certain other recent investigators 
that acuity continues to increase with increases 
in brightness, even at very high brightness 
levels. Other workers whose data indicate that 
acuity ceases to increase beyond a certain 
brightness level have usually reported that 
their curves do not flatten out until an illum- 
ination of about 5 to 10 foot-candles has been 
reached. 

There is no fundamental discrepancy be- 
tween such findings and the present results. 
Acuity studies deal with threshold phenomena 
and relatively simple stimulus materials while 
our data are from a relatively complex visual 
task in which the digits and the pointer are 
well above threshold size. Performance is 
thus a function not so much of acuity as of 
speed and accuracy in making a visual judg- 
ment which often requires an interpolation. 
Hence the lack of close correspondence between 
our results and the earlier acuity studies should 
occasion no dismay. 

In our dial reading task one important 
variable is the effective or subjective contrast 
between white figure and dark background. 
It is true of course that contrast, defined 
physically, is independent of illumination. At 
our low brightness levels however there are 
obviously fewer j.n.d.’s of brightness between 
figure and ground than there are at brightness 
values which are great enough so that the 
psychophysical function is approximately con- 
stant. Some approximate calculations based 
on Blackwell’s contrast threshold data (1), 
and assuming a brightness ratio of 10 to 1 
between our stimulus figures and their back- 
ground, indicate that for a background bright- 
ness of 0.001 foot-lamberts a stimulus figure 
would have to have a brightness very close 
to 0.01 foot-lamberts to be at liminal contrast. 
This means that for a figure brightness of 0.02 
foot-lamberts the figure-ground contrast is 
approximately 2 j.n.d.’s. By comparison, at
a figure brightness of 0.1 foot-lamberts the
figure-ground contrast would be roughly 7 
j.n.d.’s. Thus our critical 0.02 foot-lambert 
level represents a value which provides barely 
adequate subjective contrast; for values below 
this level performance suffers decrement due 
to insufficient subjective contrast (expressed 
in j.n.d.’s) while for values above this level 
contrast is sufficiently great that it no longer is 
a significant variable for the task in question. 

From a practical standpoint the results of 
the present study indicate that in visual 
perceptual tasks of this nature where maximum 
performance is desired with a minimum of 
brightness (in order, for example, to conserve 
dark adaptation) care should be taken to keep 
the brightness level safely above this critical 
region (0.02 to 0.05 foot-lamberts).
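
The recommendation above amounts to a simple threshold check. The following is a minimal sketch, assuming only the 0.02 to 0.05 foot-lambert critical region stated in the text; the function name `brightness_zone` and the three zone labels are hypothetical and are introduced purely for illustration.

```python
# Bounds of the critical region reported in the text, in foot-lamberts.
CRITICAL_LOW_FL = 0.02
CRITICAL_HIGH_FL = 0.05


def brightness_zone(brightness_fl: float) -> str:
    """Classify a dial brightness relative to the reported critical region.

    The labels are illustrative only; they are not terms used in the study.
    """
    if brightness_fl <= 0:
        raise ValueError("brightness must be positive")
    if brightness_fl < CRITICAL_LOW_FL:
        return "below critical region"
    if brightness_fl <= CRITICAL_HIGH_FL:
        return "within critical region"
    return "above critical region"


if __name__ == "__main__":
    # The extremes of the range studied (0.005 and 6.0) plus one value inside the region.
    for level in (0.005, 0.03, 6.0):
        print(f"{level} foot-lamberts: {brightness_zone(level)}")
```

Only the last zone meets the recommendation of keeping brightness safely above the critical region; how large a margin counts as safe is not specified in the text.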

These findings have implications for the 
night operation of civilian and military equip- 
ment such as aircraft, and in general for the 
viewing of complex visual stimuli at low levels 
of illumination. They indicate that if the 
visual material to be dealt with has a brightness 
safely above the critical value then visual 
perception will be as rapid and as accurate as 
it would be at higher brightness levels (at least 
up to 6 foot-lamberts, and possibly indef- 
initely). 

Two limiting conditions should be kept in 
mind in connection with generalizations and 
applications of the present findings. The first 
is that these data have been gathered on 
photographic reproductions of dials rather 
than actual dials. Thus a parallax error due 
to angle of viewing the dial is not possible. 
Contrast may not be quite as high as on 
instrument dials and reflections from a glass 
face are lacking. In spite of these differences 
it is believed that the function measured is of 
such fundamental validity that it will apply 
to many other dial reading and similar tasks. 

A second limitation to the present findings 
inheres in the fact that the data were taken 
under conditions in which fatigue effects were 
probably not a significant variable. It may 
be that for a long-continued task of this 
general nature a minimum brightness value 
should be recommended which would be higher 
than that suggested by the present experiment. 
Further research is needed to supply informa- 
tion here. 


S. D. S. Spragg and M. L. Rock 


Summary 


Experiments are reported on the speed and 
accuracy with which subjects can read photo- 
graphic reproductions of instrument dials as a 
function of the brightness of the dial markings. 

Young adult males, rigorously screened so 
that they constituted groups with excellent 
visual abilities, served as subjects in dial 
reading tasks. A brightness range of 0.005 to 
6.0 foot-lamberts was used. Both for time 
and for error frequency scores a critical 
brightness level was found at approximately 
0.02 foot-lamberts. At brightnesses below 
this level performance was increasingly im- 
paired; above this level increases in brightness 
produced little or no improvement in visual 
performance.

These findings suggest that for the night- 
time operation of equipment where dial-reading 
and comparable visual tasks are involved 
brightness values should be kept safely above 
the critical 0.02 foot-lambert level. As long 
as this is done visual performance will be as 
rapid and as accurate as at higher levels (i.e. 
brightness ceases to be a significant variable). 


Received January 24, 1952.
Early publication.


References 


1. Blackwell, H. R. Contrast thresholds of the human eye. J. opt. Soc. Amer., 1946, 36, 624-643.

2. Carmichael, L., and Dearborn, W. F. Reading and visual fatigue. New York: Houghton Mifflin, 1947.

3. Chalmers, E. L., Goldstein, M., and Kappauf, W. E. The effect of illumination on dial reading. USAF Technical Report No. 6021, Air Materiel Command. August, 1950. 25 p.

4. Crook, M. N., Harker, G. S., Hoffman, A. C., and Kennedy, J. L. Effect of amplitude of apparent vibration, brightness, and type size on numeral reading. USAF Technical Report No. 6246, Air Materiel Command. September, 1950. 54 p.

5. Fulton, J. F., Hoff, P. M., and Perkins, H. T. A bibliography of visual literature, 1939-1944. Menasha, Wis.: Geo. Banta, 1945.

6. Fulton, J. F., Marquis, D. G., Perkins, H. T., and Hoff, P. M. A bibliography of visual literature, 1939-1944. Supplement. Menasha, Wis.: Geo. Banta, 1945.

7. Hoff, E. C., and Fulton, J. F. A bibliography of aviation medicine. Baltimore: C. C. Thomas.

8. Kappauf, W. E., and Smith, W. M. Design of instrument dials for maximum legibility. II. A preliminary experiment on dial size and graduation. USAF Memorandum Report No. MCREXD-694-1N, Air Materiel Command. July, 1948. 16 p.

9. Kappauf, W. E., Smith, W. M., and Bray, C. W. Design of instrument dials for maximum legibility. I. Development of methodology and some preliminary results. USAF Memorandum Report No. TSEAA-694-1L, Air Materiel Command. October, 1947. 42 p.

10. Lawrence, M., and Macmillan, J. W. Annotated bibliography on human factors in engineering design. Aviat. Br., Res. Div., BuMed., U. S. Navy, 1946.

11. Rock, M. L. Visual performance as a function of low photopic brightness levels. USAF Technical Report No. 6013, Air Materiel Command. November, 1950. 31 p.

12. Smith, W. M., and Kappauf, W. E. Studies pertaining to the design and use of visual displays for aircraft instruments, computers, maps, charts, and tables: a bibliography. USAF Memorandum Report No. TSEAA-694-1G, Air Materiel Command. May, 1947. 25 p.

13. Spragg, S. D. S., and Rock, M. L. Dial reading performance as related to illumination variables. I. Intensity. USAF Memorandum Report No. MCREXD-694-21, Air Materiel Command. October, 1948. 29 p.

14. Spragg, S. D. S., and Rock, M. L. Dial reading performance as related to illumination variables. III. Results with small dials. USAF Technical Report No. 6040, Air Materiel Command. November, 1950. 8 p.

15. Troland, L. T. The principles of psychophysiology. Vol. II. Sensation. New York: D. Van Nostrand, 1930.


Critique of Rock’s “A Sales Situation Test” 


Jack Bernard * 
The Du Bois Company, Cincinnati, Ohio 


Motivations for the publication of articles in 
scientific journals are unquestionably varied. 
One may publish out of desire to spread 
scientific findings, out of desire for self-adver- 
tisement, out of desire for advertising his 
wares (as in the case of new apparatus), or for 
any combination of these or other motives. 
Where quality and scientific accuracy are 
maintained at a high level resulting in the 
presentation of something which is of value, 
questions of motivation are academic. How- 
ever, where scientific accuracy is remarkable 
largely through its absence and exhortative 
conclusions are drawn which are founded on 
the quicksands of inadequate statistics, the 
question may be legitimately asked: Is it 
science or salesmanship? Rock’s article on his 
“Sales Situations Test” is a case in point. 

Rock informs us that the “sales situations” 
(items) in his test came from sales managers 
who were asked to make up situations calling 
for sales judgment. These were then edited. 
Whether coincidence crept in during the 
writing or during the editing is an open ques- 
tion, but 13 out of Rock’s 25 questions (more 
than 50 per cent of the test) wind up as para- 
phrases of questions in a “sales sense” test 
developed a number of years earlier by Canfield.²
For example:


Rock, item 1: 


A salesman has difficulty in getting in to see 
his prospect. The executive's secretary refuses 
to admit him to see her employer. The sales- 
man’s procedure in this situation should be: 
Put his proposition in writing for the prospect.
Interview a minor official and through him reach the prospect.
Sell the secretary, hoping she will sell the boss.
Obtain the secretary's cooperation through favors.


* Formerly Chief Psychologist with The Klein Institute for Aptitude Testing, Inc.
1 Rock, M. L. A sales situation test. J. appl. Psychol., 1951, 35, 331-332.
2 Canfield, B. R. How perfect is your “Sales Sense”? New York: The Klein Institute for Aptitude Testing, Inc., 1945.


Canfield, item 46: 


A salesman calling on a business executive
experienced difficulty in securing an interview.
The prospect's secretary refused to admit the
salesman to her employer. The salesman's
procedure in this situation should have followed
which one of the following courses:


(1) Interview a minor official and through
him reach his superior.

(2) Put his proposition in writing for the
prospect.

(3) Sell the secretary, hoping that she will
sell the employer.

(4) Obtain the cooperation of the secretary
in getting an interview.


Nor is this exceptional. The following are
equally parallel:


Items

Rock:      2  4  6  7  8  9 11 13 15 19 21 22
Canfield: 21 40 45 33 37 38 34 17 22 23 13 18


When simple failure to give due credit is
regarded as a breach of ethics, what must we
consider a claim of originality for paraphrased
material?
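
The overlap claimed above can be tallied directly. The sketch below is an illustration only: the pairings are the item numbers as read from the worked example (Rock item 1, Canfield item 46) and from the table of parallel items, and the names `ROCK_TO_CANFIELD` and `overlap_share` are hypothetical.

```python
# Rock item number -> Canfield item number, as listed by the critic.
ROCK_TO_CANFIELD = {
    1: 46,   # the worked example quoted in full
    2: 21, 4: 40, 6: 45, 7: 33, 8: 37, 9: 38,
    11: 34, 13: 17, 15: 22, 19: 23, 21: 13, 22: 18,
}

ROCK_TEST_LENGTH = 25  # number of items in Rock's test, as stated in the critique


def overlap_share(pairs: dict, test_length: int) -> float:
    """Fraction of the test's items that have a listed counterpart."""
    return len(pairs) / test_length


if __name__ == "__main__":
    share = overlap_share(ROCK_TO_CANFIELD, ROCK_TEST_LENGTH)
    print(f"{len(ROCK_TO_CANFIELD)} of {ROCK_TEST_LENGTH} items ({share:.0%})")
```

The tally reproduces the figure of 13 items out of 25, just over half the test. It counts listed pairings only and says nothing about how close each paraphrase is.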

As for the attempt to justify statistically the 
“close” (this sales terminology seems to apply 
better than “Summary”), it would be only 
Christian charity to trust that “he knew not 
what he did.” 

Briefly: No reliability data are presented 
when: (a) low reliability is typical of the few 
tests available in this field; and (b) low reliability
is the rule on questionnaire-type tests
containing so few (25) items.

Statistics were computed and conclusions
drawn based on three samples of 25, 26, and 31
persons each. Why the rush to publish with
notoriously unreliable small sample analyses
when considerably larger populations were
available? The notation, “Early publication,”
appended to the article might well read
“Premature publication.”

As an offshoot of the above come questions
as to the representativeness of the sample
chosen. Are the “production supervisors”
representative of “non-salesmen”? This is
very doubtful. Are the “consumer salesmen” 
and “industrial salesmen” groups typical of 
those categories? The present writer, having 
seen thousands of Canfield tests taken by 
salesmen, would expect the reverse of the 
difference reported by Rock. 

Where does the blame lie? With the young 
author who might understandably place the 
prestige of authorship above scientific caution 
and accuracy? With the harried editorial 
staff? And more important still, what can be
done to protect readers of scientific journals
from the fallacy of “it is printed and therefore 
it is fact”? 

In the opinion of the present writer, the 
principal contribution made by Rock in 
publishing his “A Sales Situation Test” is to 
call our attention once more to a crying need 
for a more rigorous screening of articles 
submitted to the psychological journals. 


Received January 5, 1952. 
Published out of turn by the editor. 


Answer to Bernard’s Critique of Rock’s “A Sales Situation Test” 


Milton L. Rock 
Edward N. Hay & Associates, Inc., Philadelphia, Pa.


Professional people are motivated to publish 
articles chiefly to make available to other 
workers methods used and results secured on a
problem. This is important so that men busy
working in the field will have a body of data 
available to guide them in research problems 
in order that they do not re-do work that has 
already been accomplished. 

Gathering items for a test from people 
working in the field is common sense but it is
now apparent that along with this, there are 
also some possible disadvantages. It is possible
that some of the items secured in this manner 
may by accident duplicate items in other tests. 
This can only be avoided—in order to give 
due credit to the originator of the items—if 

their tests are known and can be studied.


Canfield’s How Perfect is Your Sales Sense
test is a test that is difficult to obtain. In
going through the literature from 1945 to the
present (Psychological Abstracts as well as
commercial literature) an article by Fleming
and Fleming¹ has come to light, in which
mention is made of the Canfield test. But there is
no description, no data and no reference to
the test in Fleming’s bibliography. Until
Bernard’s criticism, I had no idea that
such a test existed.
As described in my report of October, 1951,
the method of constructing this test was to
obtain sales situation descriptions from my
own experience and that of client organizations.

1 Fleming, E. G., and Fleming, C. W. A qualitative approach to the problem of improving selection of salesmen by psychological tests. J. Psychol., 1946, 21, 127-150.

For the record, Company No. 1 in my 
report was Scott Paper Company, Chester, Pa. 
Company No. 2 was a large and prominent 
company in the Middle West (name on 
request). Credit for help should also be 
given to National Drying Machinery Com- 
pany, Philadelphia, Pa., and U. S. Fidelity & 
Guaranty Company, Baltimore, Md., and 
other companies. 

The only possibility that any questions were 
based on items in the Canfield test is that one 
of these companies had used the Canfield test 
and constructed new items based on items in 
that test. This possibility has never before 
arisen and there has been no time to communi- 
cate with these sources and find out. If this 
happened, it is unfortunate. However, the 
situations which arise in salesmen’s calls are 
somewhat universal. If it were otherwise there
would be small point in including them in a 
test of sales knowledge. The constant re- 
appearance in mental tests of the same situa- 
tions phrased differently is well known. 

Concerning the population, the article was 
intended to be an introductory article on the 
problem of salesmen selection and definitely
stated the size of the samples and used small
sample statistics. The production supervisor 
group was one of four of the same size tested 
over the past year and a half and the results 
on the battery of tests given to the other three 


140 


groups, as described in the article, indicate 
that this group is representative of the produc- 
tion supervisors of this company. The sales- 
men populations were as follows: We received 
31 out of a total of 34 from Company No. 2 
and in the Scott Paper Company, 25 were 
distributed geographically and we received 
them all. 

At this time, as mentioned in the article, we 
are trying to broaden the sample to include a 
variety of vocational situations in order that 
we may use a follow-up method to see its 
predictive value in the selection of salesmen.
It may be interesting to note that another
prominent company tested 17 industrial salesmen.
The results showed a range from 17 to
32 with a mean of 23.6 and a sigma of 4.4.
This shows no significant difference from the
technical salesmen as tested in Company No. 2.

Messrs. Bernard and Canfield may be
assured that if their test had been available,
and if some of these questions were similar
to mine, as they say, they would have received
due credit. In all probability, if their test had
been available, we would have used it. However,
the disturbing part is that their test should
have been used for six years, thousands of
them having been administered, and yet the
test does not appear in the literature in such
a manner that it can be used by others.

Received January 24, 1952.
Published out of turn by the editor.


Editor’s Reply to Bernard’s Criticism 


Donald G. Paterson 


University of Minnesota 


Bernard’s criticism of Rock’s article included 
an attack on the editor for accepting Rock’s 
paper. Perhaps a statement of editorial 
policy is indicated. 

Standards of editorial judgment admittedly 
vary depending upon the subject matter, A 
paper factor analyzing a set of selection tests 
must meet a far stricter standard of judgment 
than one that breaks new ground in a field of 
applied psychology. In other words, if a 
paper opens up new territory for exploration 
the editor is inclined to accept it, not because 
of its technical excellence, but because it is 
likely to lead to new research in a new field. 


Such a paper is sometimes accepted in spite of 
shortcomings.


The field of selecting salesmen is a difficult
and challenging one. In addition to published
data on weighted application blanks, patterned
interviews, intelligence tests, and measured
vocational interests, several tests have been
developed that purport to measure “sales
sense,” “sales judgment,” or “sales aptitude.”
At least one of these was developed and
marketed as early as 1936. But authors and
distributors of such tests seem to avoid
describing them in the scientific literature.


Buros’ Third Mental Measurements Yearbook
(1949) contains a review of only one such test
and the facts disclosed about that test are
most disheartening (see p. 704). For this
reason, Rock’s article was accepted with
alacrity. Thus, it becomes the first test of
this type to be put on top of the scientific
table for everyone to scrutinize. It is to be
hoped that this test, like its competitors, will
now be subjected to independent cross-validation.

Publication of articles does not imply
editorial endorsement. Neither does the publication
of Bernard’s criticism imply editorial
endorsement of his rather sharply worded
attack. Furthermore, the editor assumes that
the readers of the Journal of Applied Psychology
are sufficiently mature and sufficiently competent
not to be easy victims of the fallacy of
“it is printed and therefore it is fact.” Finally,
the present editor deliberately avoids any
semblance of censorship because the idea of
“thought control” in science is, to him, as
repugnant as “thought control” is in a police
state.


Received February 18, 1952. 
Published out of turn by the editor. 




Special Review 


Eells, Kenneth, Davis, Allison, Havighurst, 
Robert J., Herrick, Vergil E., and Tyler, 
Ralph. Intelligence and cultural differences.
Chicago: University of Chicago Press, 1951. 
Pp. xii plus 388. $5.00. 

This volume is presented as “the first part
of an extended study of cultural learning as it
bears upon the solution of problems in mental
tests.” It is a phase of the research program
of the Committee on Human Development of
the University of Chicago. Part III, “A
Report of the Field Study,” is drawn from
Eells’s doctoral dissertation. Part II, also
prepared by Eells, is a summary and discussion
of his dissertation findings. Part I includes
five review or discussional chapters, three of
which are revisions of earlier journal publications,
by the remaining co-authors listed above.

The study originates in the preoccupation
of the Chicago group with phenomena of
stratification and the social class structure of
American society. Social class or status is
perceived as a crucial determinant of personality
and behavior in various life spheres, and
as the inhibiting or facilitating force in the
child’s development to adulthood. In the
study here reviewed, status, from which flow
cultural differences in experience and exposure,
is considered as determinant of responses to
the specific items of intelligence tests. Because
such tests are widely used in the managerial
aspects of our society, the authors are concerned
lest possible cultural bias give spuriously
low scores to children from low status homes.
They grant the replicated evidence of correlation
between test scores and socio-economic
measures, but propose to question again the
meaning of these differences: are they genetically
determined; are they environmentally
determined; or are they the result of cultural
bias in the content of specific test items?

At a time when fullest utilization of our
human resources is more pressing than ever
from the standpoint of national survival, the
legitimacy and pertinence of such inquiry is
unquestioned. But the inquiry itself must be
dispassionate and objective. These qualities
are somewhat lacking in the present

volume. Havighurst and Davis, contributing 
three chapters to Part I, transmute assump- 
tions and hypotheses into foregone conclusions. 
Part I seems to this reviewer to contain 
particularly flagrant examples of special plead- 
ing, particularly in view of the inconclusive 
findings of the research study itself. Davis 
and Havighurst apparently assume: (1) that 
all methodological questions in the study of 
stratification have been solved; (2) that family 
status is the prepotent determinant of individ- 
ual behavior; and (3) that Eells’s results bear 
out their foregone conclusions, which is 
certainly not the case. 

Turning now to the research section of the 
volume (Part IIT), Eells undertakes to analyze 
the responses of a large number of pupils to 
items drawn from ten tests or subtests of nine 
widely-used intelligence tests. The testing 
was done in the schools in and around Rockford,
Illinois; approximately 5,000 pupils were
included, almost equally divided between nine- 
and ten-year olds and thirteen- and fourteen- 
year olds. His basic sample represents well 
over 90 per cent of the population of children 
of these ages, but the analyses essential to his 
hypotheses involve only those pupils with 
parents at the clear extremes of his status scale: 
the younger group contains approximately 
225 high-status and 325 low-status pupils; the 
older group has approximately 235 and 358 
pupils in the respective status groups. Low- 
status “ethnic” groups drop out of the analysis 
early, since they prove to be similar to low- 
status “Old Americans.” 

Rockford was chosen as the experimental 
area after applying a set of pragmatic criteria 
which permit no conclusions about its represen- 
tativeness in a strict sampling sense. Sim- 
ilarly, in choosing the two age groups, practical 
factors again appeared to dictate the choice. 
These age groups were designed to show “any 
changes in status differentials” attributable to 
age, but no hypotheses regarding the develop- 
mental time or nature of such possible changes 
are set forth.

Status measurements are based on a modification
of the schema set forth in Warner
et al., Social Class in America: data on father’s
occupation, parental education, house-type, 
and dwelling area were obtained from a 
parents’ questionnaire. These items, rated 
and equally weighted, yielded the Index of 
Status Characteristics upon which all the 
families of each age group were separately 
distributed. These distributions were then 
cut “so that the high- and low-status ranges
would be as nearly equivalent as possible to 
upper-middle class and lower-lower class 
groups.” It should be apparent that the 
entire sampling process is designed to maximize 
the chances of proving the hypothesis of 
cultural bias in test items. No middle group 
is used in the item analyses as a check on the 
assumption of cultural bias; no attention is 
paid to possible sex differences in item re- 
sponses of the two age groups; no attention is 
paid to the factorial composition of the two 
sets of tests to determine whether or not 
cross-sectional measures of the same ability 
domains have been employed. 

Two chapters are given over to the correla- 
tional and group difference analyses of total 
test scores in relation to ISC. The findings 
are as might be expected: test scores and ISC 
show significant correlations in the range .20 to 
.43; extreme groups on ISC scores show IQ 
differences of about eight to twenty points, 
depending on the test used; some curvilinearity 
exists for some of the tests in relation to ISC. 
It is important to note in this connection that 
the authors miss one point: if class position is 
the major factor in producing the correlation 
between intelligence and socio-economic data, 
the resultant correlations should be much 
higher than those ordinarily obtained. In 
actuality, socio-economic factors account for 
so little of the variance that other factors must 
be operative in producing intelligence test 
performance. 
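
The reviewer's point about variance can be made concrete. The sketch below assumes the usual reading of a squared correlation coefficient as the proportion of variance shared between two measures; that reading is supplied here for illustration and is not spelled out in the review, and the function name `variance_explained` is hypothetical.

```python
def variance_explained(r: float) -> float:
    """Proportion of variance shared by two measures correlated at r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return r * r


if __name__ == "__main__":
    # The bounds of the range of test-score/ISC correlations reported above.
    for r in (0.20, 0.43):
        share = variance_explained(r)
        print(f"r = {r:.2f}: about {share:.1%} shared, {1 - share:.1%} not")
```

On this reading, even the highest reported correlation leaves more than four-fifths of the variance in test scores unaccounted for by the status index.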

The last eight of the twenty-three chapters 
contain the evidence crucial to the basic 
hypotheses: item analyses contrasting high- 
and low-status responses to items reached and 
attempted by 95 per cent of the group. As 
Eells himself states with commendable restraint
in Chapter XVI, “the findings will not be conclusive.”
One must first understand clearly
how many test items are actually under scrutiny.
The original tests contained 967 possible
items; approximately one-third of them are
eliminated from the analysis because they are
not reached and attempted by 95 per cent of
the pupils. “Unstable items” (too hard or too
easy) are also eliminated. There are 33
items studied in the younger age group; of these 
53 per cent are significant at the one per cent 
level, 10 per cent at the five per cent level, and 
37 per cent do not reach the five per cent level. 
In the older group, out of 324 items, eighty- 
eight per cent are significant at the one per cent 
level, three per cent at the five per cent level, 
and nine per cent do not reach the five per cent 
level. 
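
For the older group, the percentages above can be turned back into approximate item counts. This is a rough sketch: the counts are rounded from the stated percentages of the 324 items and are not figures given in the review, and the names used are hypothetical.

```python
OLDER_GROUP_ITEMS = 324  # items studied in the older age group

# Share of items at each significance level, as reported for the older group.
SHARES = {
    "significant at the one per cent level": 0.88,
    "significant at the five per cent level": 0.03,
    "not significant at the five per cent level": 0.09,
}


def approximate_counts(total: int, shares: dict) -> dict:
    """Convert reported shares into whole-item counts by rounding."""
    return {label: round(total * share) for label, share in shares.items()}


if __name__ == "__main__":
    counts = approximate_counts(OLDER_GROUP_ITEMS, SHARES)
    for label, count in counts.items():
        print(f"{label}: about {count} items")
    print("total:", sum(counts.values()))
```

Because the published percentages are themselves rounded, the derived counts should be treated as approximate.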

The summaries of these various chapters
may be paraphrased to cover the findings
regarding possible causal factors behind the
status differences. Position of item responses
shows inconclusive results as a possible causal
factor. Symbolism (e.g., verbal items and
geometric design items) shows inconclusive
results, since it was not controlled for “type of
question.” Type of question is handled by
setting up fifty-six logically derived categories
of items (e.g., synonyms, opposites, analogies,
etc.). When symbolism and type of question
are simultaneously studied, the results are
inconclusive. Level of difficulty of items
shows inconclusive evidence. With respect
to age, it is concluded that the higher proportion
of items showing significant status differences
among older children is due to
differences in the nature of the test materials
rather than to “inherent differences in the
status characteristics of the pupils at the
two levels.” This evidence appears conclusive.

The last two chapters involve “the analysis
. . . based largely upon a subjective process”
of seventy-five items showing differences at
the one per cent level in one or more of the
wrong-answer responses and twenty-five items
showing “unusually large status differences”
in the per cent of the two extreme status
groups giving right answer responses. Explanations
are “hypothecated” where plausible
but “the presence of such a large proportion of
unexplained differences should, however, lead
to caution in accepting the idea that all status
differences on test items can readily be
accounted for in terms of the cultural bias of
their content” (page 357).

Part II, also written by Eells, is listed as a
summary of the field study and need not be
reviewed in detail, except to point out that
he deals with “common culture” and “own
culture” (subgroup culture) as if the communalities
and disparities of social groups in
America are completely documented and
accessible facts.

It is manifestly impossible to deny the
impact on the social sciences of the work of
the last two decades of those who have shown
with brilliant descriptions and imaginative
insights the class problems in American society.
Certainly there is new strength and scope in
our research because of this accumulated
evidence. But descriptions and insights are
only first steps in research; skillful design and
testable hypotheses are also needed. The 
sampling, theoretical, and design problems 
suggested within the body of this review raise 
serious doubts about the worth of this study. 
His mentors, it would seem, had a technical 
obligation to help him arrive at a better 
thesis design. Failing this, they had a moral 
obligation to revise their thinking in the light 
of his inconclusive findings. Failing either 
or both of these desiderata, the monograph 
should have been given a critical and thorough 
editing before publication. 
John G. Darley 


Universily of Minnesota 


Book Reviews 


Flesch, Rudolf. How to test readability.
New York: Harper and Brothers, 1951. Pp. 
56. $1.00. 

This pocket-sized little manual on read- 
ability is remindful of a Culbertson contract 
bridge digest. It presents techniques, offers 
illustrations of how the techniques work, and 
shows that mere techniques alone are not 
enough. Flesch goes Culbertson somewhat 
better, however, in providing a series of 
questions and answers and in listing an excel- 
lent bibliography. The question and answer 
section is reminiscent of Gallup’s Guide to 
Public Opinion Polls. 

Early pages of the manual reprint most of a 
1948 article from this journal which presented 
Flesch’s revised readability formula. This 
how-to-do-it section is supplemented by eleven 
examples of material ranging from the Bible 
to the Adventures of Huckleberry Finn analyzed 
for their readability scores. In these examples,
sentence breaks are indicated, personal words 
are bold-faced and quantitative values for 
each formula element are provided. These 
paragraphs can be of great usefulness as 
standards both for training analysts to use the 
formula and for checking accuracy and reli- 
ability of trained analysts. Two nomographs 
are provided as calculation aids. 
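
For readers who want to see what the nomographs compute, the two scores can be sketched in a few lines. The constants below are the ones commonly cited from Flesch's 1948 article; the function and argument names are illustrative, and the sample counts are hypothetical rather than taken from the manual's worked examples. Counting syllables, personal words, and personal sentences is left to the analyst.

```python
def reading_ease(syllables_per_100_words: float, words_per_sentence: float) -> float:
    """Reading Ease score: higher values indicate easier text."""
    return 206.835 - 0.846 * syllables_per_100_words - 1.015 * words_per_sentence


def human_interest(personal_words_per_100_words: float,
                   personal_sentences_per_100_sentences: float) -> float:
    """Human Interest score: higher values indicate more personal writing."""
    return (3.635 * personal_words_per_100_words
            + 0.314 * personal_sentences_per_100_sentences)


# Hypothetical passage: 150 syllables per 100 words, 20 words per sentence,
# 6 personal words per 100 words, 12 personal sentences per 100 sentences.
sample_ease = reading_ease(150, 20)
sample_interest = human_interest(6, 12)
print(round(sample_ease, 1), round(sample_interest, 1))
```

Both scores are simple weighted sums, which is why a nomograph or a short routine of this kind is enough once the counts are in hand.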

In a more qualitative section, Flesch lists a number of hints for raising readability. Many of these, such as knowing the characteristics of one’s audience, and rearranging words, sentences, and larger units in a piece of writing are independent of his formula and of the elements which the formula considers. Other hints, such as raising interest by the “you” approach, finding simpler words, and breaking up sentences and paragraphs, consider writing variables included in the formula. Here too, examples are provided.

Among the 44 questions which Flesch asks and answers one finds succinct discussions of reliability and validity problems, other readability formulae, the effect on style of using short words and sentences, and discussion of how the formula applies to advertising, news, technical, and legal writing. The answers are supported by liberal references to the bibliography.

Of interest to the reviewer were indications that Flesch considers the Human Interest portion of his formula more important than the Reading Ease portion. In making this point, Flesch says that if the reader is genuinely interested in what he is reading, he may be able to work his way through long sentences and difficult words, but primer style will not lure a reader to a dull presentation. It would seem to the reviewer, however, that “genuine” interest is primarily related to subject matter, as some recent newspaper readership studies have demonstrated. If interest is the most important consideration it would seem that a subject-matter content analysis approach to the problem of what things interest what audience groups would yield greater dividends. Of course if subject matter can be held constant, the proportions of personal words and personal sentences assume greater importance.

All in all, this is a valuable little manual. Its modest price and excellent content will have wide appeal for all who are concerned with improved written communications.

Robert L. Jones 


Human Resources Research Institute, 
Maxwell Air Base, Alabama 


Travers, Robert M. W. How to make achievement tests. New York: The Odyssey Press, 1950. Pp. 180.

This little manual has been written as an aid for teachers, in order to help them to develop objective tests, and to provide them with techniques for defining educational goals. The introductory chapter indicates the modern tendency to attempt more systematic and complete evaluation of all the outcomes of a course. The need for new-type or objective tests to supplement essay-type tests in such an evaluation program is recognized. A chapter concerned with the making of a blueprint for an examination indicates how test content can be planned so as to have it properly allocated in relation to all of the course objectives.

Separate chapters provide discussions of the advantages and disadvantages of tests of the true-false, multiple-choice, and completion types. These discussions are supplemented by detailed directions as to procedure in constructing such test items, with careful indication of pitfalls to be avoided. One chapter provides an elementary description of procedures in assembling, administering, and scoring tests, as well as discussion of such topics as the test-item file, the directions to pupils, the correction for guessing, and use of machine methods in scoring. A final chapter treats such topics as the significance of test scores, ambiguity in grading, the validity of achievement tests, and the use of item analysis. An appendix deals with suggested methods of scoring free-answer and essay-type tests.

The author has admittedly included much material that is based on opinion. He has certainly not over-sold the new-type or objective test. To the present reviewer he appears to be too ready to accept current criticism of objective tests, and too ready to believe in the asserted values of essay-type tests.

The main use of the book will no doubt be to aid the novice in first attempts at constructing objective test items, and it is extremely elementary. The sophisticated test worker will be annoyed by the treatment which emphasizes test items as isolated bits of behavior, neglecting the implications of test items as samples, signs, signals, or symptoms.

Harold D. Carter

University of California,
Berkeley 4, California

Gulliksen, H. Theory of mental tests. New York: Wiley, 1950. Pp. xix + 486.

Several good texts in tests and measurements have appeared in recent years. Some of these books have included discussions of item construction and all have given at least an elementary presentation of the role of statistics in psychological measurement. For the most part the recent books have concentrated upon describing and evaluating existing psychological tests. Here, then, is a book which breaks with the current tradition.

Gulliksen is not concerned with existing 
tests. (The Stanford-Binet, for example, does
not appear in the index.) Rather he is in- 
terested in presenting the mathematical and 
statistical bases of test construction. In 
this respect the treatment is something like 
Thurstone’s early (1931) Reliability and Validity of Tests (upon which Gulliksen has admittedly drawn). But with twenty additional
years of research and advancement in the 
field of mental measurement, Gulliksen can 
and does go beyond Thurstone. 

“The basic theoretical material on accuracy 
of test scores is presented in Chapters 2 
through 5, which deal with the topics of test 
reliability and the error of measurement. 
The effect of test length upon reliability and 
validity is considered in Chapters 6 through 
9, and the effect of group heterogeneity on 
measures of accuracy in Chapters 10 through 
13. . . . Practical problems of criteria for 
parallel tests are given in Chapter 14, and 
experimental methods of determining reli- 
ability when a parallel form is not used are 
considered in Chapters 15 and 16. Methods 
of scoring, scaling, and equating tests are 
considered in Chapters 18 and 19. Problems 
dealing with batteries of tests are considered 
in Chapter 20, and problems of item selection 
in Chapter 21” (p. 5).
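
The dependence of reliability on test length mentioned in this outline is most often illustrated with the Spearman-Brown formula. The sketch below assumes that classical form; whether Gulliksen states it in exactly this way is an assumption, and the function name and the sample reliability are hypothetical.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when a test is lengthened by `length_factor`
    with parallel material (classical Spearman-Brown form)."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical case: a test with reliability 0.60 is doubled in length.
doubled = spearman_brown(0.60, 2)
print(round(doubled, 2))
```

A length factor below 1 gives the expected loss in reliability when a test is shortened.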

An appendix contains a table of the normal 
curve, basic equations from mathematics and 
statistics, and sample examinations in statis- 
tics and test theory. 

It is clear that Gulliksen intended his book 
as a text; yet it seems to this reviewer that 
until such time as psychology departments 
see fit to strengthen their requirements in 
mathematics, the book is going to prove to 
be more valuable as a reference than as a text.
The student who has had a good year’s course
in statistics, including the analysis of variance 
who knows his algebra, analytic geometry, 
calculus, matrices and determinants, will find 
this an excellent and profitable book to study— 
as indeed it is. But it will not go well with 
students who lack the preparation to com- 
prehend it—even if the instructor should pick 
and choose among the various chapters as 
Gulliksen suggests. 




Regarding the book as a reference work 
rather than as a text, it should be a welcome 
addition to the bookshelf of the professional 
worker in the field of mental tests. From this 
point of view Gulliksen has given us a major 
contribution and one that will be with us for some time to come.

Allen L. Edwards


The University of Washington 


Freeman, G. L., and Taylor, E. K. How to 
pick leaders. New York: Funk & Wagnalls, 
1950. Pp. 222. $3.50. 

Written for “those selecting young men for 
executive training, as well as the aspirants 
themselves,” How to Pick Leaders “attempts 
to distill out of past and current research, the 
common elements of the leadership pattern. 
It then goes on to indicate how such a pattern 
can be employed to improve the search for 
executive talent to eliminate . . . the vagaries 
of unscientific selection practices.” 

The book begins with pointing out the 
inefficiency and high cost of most of the present 
day methods of selecting executive trainees, 
Next the criterion problem is discussed, stress- 
ing its importance and a number of ways of 
measuring leadership success are suggested. 
Following a section on the building up and 
administration of a scientific selection program, 
the main portion of the book is devoted to 
selection tools and techniques. Included here 
are recruiting and screening, interviewing, 
aptitude testing, consideration of past perform- 
ance, personality measurement, and rating. 
The book concludes with an overview of the 
total selection program and a section on the 
importance of continuing follow-up of those selected.

The authors have done an especially good job of bringing together rese... number of varied sources, ... find, in a book written for t... the authors respect their r... earth way in which the bo... to insure that it will be und... whom it was intended. ... examples and illustrations ... into subject matter that might otherwise be dry or academic sounding for the lay reader.

The reader with a knowledge of the literature in this area will find an overly optimistic tone to the book which is not warranted on the basis of research evidence. For example, the relative lack of success to date reported by researchers who have been seeking a reliable criterion does not justify the statement, “Any company has the means at hand for . . . true objective measurements of relative leadership success.” Neither does research evidence back up the implication in the book that valid tools to include in the selection program are readily available or fairly easy to develop. Unfortunately, because of this optimistic tone, the lay reader is likely to picture the process as a relatively highly developed one from which immediate results can be expected.

Nevertheless, this book does bring together into a single volume almost all of the promising tools and techniques for the scientific selection of executives. The result is an understandable and fairly complete source of information for the lay reader on this important area of selection in business and industry.

Theodore R. Lindbom

Prudential Insurance Company of America,
Newark, New Jersey


Vernon, P. The structure of human abilities. New York: John Wiley and Sons, Inc., 1950. Pp. 160. $2.75.

For those, like the reviewer, whose knowledge of factor analysis derives almost entirely from books rather than from journal articles, this latest book of Professor Vernon’s should be invaluable. Many years ago Burt attempted to reconcile the opposed theories of Spearman and Thurstone by suggesting a hierarchical theory of human abilities; a theory inspired by Spencer. Like many another conciliator in many another field, Burt found himself attacked by the rival schools. Now, in an appendix to the present book, Vernon gives his reasons for preferring the General plus Group Factor, or Hierarchical Theory to the Multiple Factor Theories. Unfortunately some theories, such as Hotelling’s, which seem to be the most satisfactory to the mathematical statisticians, are dismissed as not justifying the extra effort of calculation. Perhaps the best argument in favour of the Hierarchical Theory is that for practical purposes a measure of g is essential; and so, whether they like the British Theory or not, most American educationists, personnel psychologists, and others who use psychological tests, do in fact measure g even though they may give it a capital letter.

Vernon has attempted the frightening task of reviewing “Almost all the contributions from about 1935 to 1949,” and the more important works. Much information has been reworked to conform with the Hierarchical Theory. It is not surprising to find that the reworked data fit quite well. By the time that the reader has finished this book, which has been written with great fairness, he will probably find that factor analysis has much less to offer than many of its exponents would have him believe. Apart from g the only reasonably well-established factors are v:ed (verbal, educational, numerical), k:m (practical, mechanical, spatial), and the X factor, which seems to be a complex affair categorizing motivation. This X factor has an air of untouchability about it, which its importance belies. Reviewers are warned not to criticize adversely authors for not doing something which they did not set out to do, and it is true that Vernon did not “. . . attempt to cover studies of personality factors, attitudes and interests, or other fields outside abilities.” Nevertheless when we find repeated references to the importance of this X factor, and warnings that it affects test scores to a marked degree, we are justified in asking for a proper discussion of this factor. How is it measured? How can we calculate its effect on scores? Can it be altered?

For many, the most interesting chapter will 
be that on Occupational Abilities. Evidence 
for broad factors of manual, or finger dexterity 
is lacking, so the use of various tests aimed at 
measuring a non-existent factor should be 
discouraged. The practice, quite common in 
Great Britain, of making a standardized work 
sample for selection purposes is wholly justi- 
fied. Again, memory seems to be a collection 
of specifics, which do not unite to form a group 
factor (is X intruding here too?), wherefore 
then the countless maze experiments of the 
rat men? The separate fields of psychology
are still remote from each other. The German 
psychologists are treated rather roughly, too 
roughly, for, if factor analysis yields results 
differing according to the method used, then 
it is not surprising that the Germans have 
found results which differ from those of factor 
analysis. Some reference to work on produc- 
tive thinking such as that of Duncker or 
Wertheimer would have been welcome. Many 
tests can be treated by transposing the problem 
into different media of thought. We know too 
little of the effect of this. 

The book deserves to be bought (the price 
is very reasonable), and read. The prose is 
readable, although Vernon stoops to such 
horrible words as “stimulatingness.” It should
certainly help clear the air, and give those 
whose work lies more in the applied, than in the 
theoretical field, a clearer view of what factor 
analysis has done. 

Douglas Irvine 


National Institute of Industrial Psychology, 
London, England


New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be sent to Donald G. Paterson, Editor, Department of Psychology, University of Minnesota, Minneapolis 14, Minnesota.

Factor analysis of reasoning tests. Dorothy C. Adkins and Samuel B. Lyerly. Chapel Hill: University of North Carolina Press, 1952. Pp. 122. $2.00.

An introduction to projective techniques. Harold H. Anderson and Gladys L. Anderson, Editors. New York: Prentice-Hall, Inc., 1951. Pp. 720. $6.75.

Community planning for human services. Bradley Buell and associates. New York: Columbia University Press, 1952. Pp. 464. $5.50.

Childhood problems and the teacher. Charlotte Buhler, Faith Smitter, and Sybil Richardson. New York: Henry Holt and Co., 1952. Pp. 372. $3.75.

How much do you know about alcohol. Thomas R. Carskadon. New York: Association Press, 1951. Pp. 31. $.10.

Psychology in the service of the school. M. F. Cleugh. New York: Philosophical Library, 1951. Pp. 183. $3.75.

Changing attitudes through social contact. Leon Festinger and Harold H. Kelley. Ann Arbor: Publications Department, Institute for Social Research, University of Michigan, 1951. Pp. 83. $1.50.

The art of clear thinking. Rudolf Flesch. New York: Harper and Brothers, 1951. Pp. 212. $2.75.

Fundamentals of social psychology. Eugene L. Hartley and Ruth E. Hartley. New York: Alfred A. Knopf, Inc., 1952. Pp. 832. $5.50.

Group treatment in psychotherapy. Robert G. Hinckley and Lydia Hermann. Minneapolis: University of Minnesota Press, 1951. Pp. 136. $3.00.

Speech training. A. Musgrave Horner. New York: Philosophical Library, 1951. Pp. 176. $3.75.

Human factors in management. Revised edition. Schuyler Dean Hoslett, Editor. New York: Harper and Brothers, 1951. Pp. 327. $4.00.

Thinking. An introduction to its experimental psychology. George Humphrey. New York: John Wiley and Sons, Inc., 1951. Pp. 331. $4.50.

Cerebral mechanisms in behavior. Lloyd A. Jeffress, Editor. New York: John Wiley and Sons, Inc., 1951. Pp. 311. $6.50.

Changing the attitude of Christian toward Jew. Henry Enoch Kagan. New York: Columbia University Press, 1951. Pp. 155. $2.75.

The prediction of performance in clinical psychology. E. Lowell Kelly and Donald W. Fiske. Ann Arbor: University of Michigan Press, 1951.

... development. Raymond G. ... New York: Harper and Brothers, 1951. Pp. 642. $5.00.

Sizing up people. Donald A. Laird and Eleanor C. Laird. New York: McGraw-Hill Book Co., Inc., 1951. Pp. 270. $3.75.

The retarded child. Herta Loewy. New York: Philosophical Library, 1951. Pp. 160. $3.75.

The psychology of human learning. John A. McGeoch and Arthur L. Irion. New York: Longmans, Green and Co., Inc., 1952. Pp. 596. $5.00.

Argument of laughter. D. H. Monro. Melbourne University Press; New York: Cambridge University Press, 1951. Pp. 264. $3.75.

Readings in personnel administration. Paul Pigors and Charles A. Myers. New York: McGraw-Hill Book Co., Inc., 1952. Pp. 483. $4.50.

Social science and psychotherapy for children. Otto Pollak, et al. New York: Russell Sage Foundation, 1952. Pp. 242. $4.00.

A laboratory manual for social psychology. Wilbert S. Ray. New York: American Book Co., 1951. Pp. 173. $3.00.

Children who hate. Fritz Redl and David Wineman. Glencoe, Ill.: The Free Press, 1951.

The psychology of adolescence. Alexander A. Schneiders. Milwaukee: Bruce Publishing Co., 1951. Pp. 55... $4.00.

Problems of infancy and childhood. Milton J. E. Senn, Editor. New York: Josiah Macy, Jr. Foundation, 1951. Pp. 181. $2.25.

Symposium on the healthy personality. Milton J. E. Senn, Editor. New York: Josiah Macy, Jr. Foundation, 1950. Pp. 298. $2.50.

Curriculum development as re-education of the teacher. George Sharp. New York: Bureau of Publications, Teachers College, Columbia University, 1951. Pp. 132. $3.50.

Occupational information. Second edition. Carroll L. Shartle. New York: Prentice-Hall, Inc., 1952. Pp. 448. $5.00.

Diagnosing human relations needs. Hilda Taba, et al. Washington, D. C.: American Council on Education, 1951. Pp. 155. $1.75.

The study of instinct. N. Tinbergen. New York: Oxford University Press, 1951. Pp. 228.

Teaching elementary reading. Miles A. Tinker. New York: Appleton-Century-Crofts, Inc., 1952. Pp. 366. $3.50.

Personal and social adjustment. Wayland F. Vaughan. New York: The Odyssey Press, Inc., 1951. Pp. 592.

Student personnel work in college. C. Gilbert Wrenn. New York: Ronald Press Co., 1951. Pp. 58...

Productivity, supervision and morale among railroad workers. Survey Research Center, University of Michigan. Ann Arbor: University of Michigan Press, 1951. $1.50.

Selecting supervisors. United States Civil Service Commission. Washington 25, D. C.: Superintendent of Documents, Government Printing Office, 195...

...nships. ... Academy of Medicine, 1951. Pp. 75. ... request.


Journal of Applied Psychology


Vol. 36, No. 3


June, 1952


Attitudes Toward Older Workers * 


Jacob Tuckman and Irving Lorge 
Teachers College, Columbia University 


In business and industry there are significant restrictions in the hiring, upgrading, and retention of older workers, i.e., men and women 45 years of age and over. These restrictions vary with occupation, industry, worker characteristics and the condition of the labor market (1). The problem of the older worker has been an especially pressing one during periods of rising unemployment. However, even during periods of full employment, restrictions in hiring older workers are relaxed in general only when the supply of younger, presumably more desirable workers has been exhausted.

Many reasons have been given to explain the reluctance of employers to hire older workers. It is claimed that older workers are slow, increase production costs, have a higher accident rate, are a poor investment, resent younger supervisors, resist new procedures and work methods, have a higher rate of absenteeism, are hard to get along with, etc. There is very little evidence to justify these complaints. There is little doubt that the worker slows up physiologically as he gets older. However, the physical changes that occur with age are gradual, and vary widely among individuals. These physical changes may affect the production of workers on jobs requiring physical strength, energy and speed at a maximum, but little is known of the effect of the changes on job performance. The belief that older workers are more prone to industrial accidents or more prone to absenteeism is not corroborated by the facts. In a recent study of 17,800 workers ranging from under 20 to 74 years, in 109 manufacturing establishments, Kossoris (2) concludes, “The only disadvantage of older workers . . . is that their disabilities last longer once they are injured. But they are on the whole less likely to be absent as frequently and perhaps less likely to be injured than younger workers.”

* Retirement and Adjustment Series: Number 3. Sponsored cooperatively by the Institute of Adult Education and the Institute of Psychological Research, Teachers College, Columbia University.

While little is known of the relationship 
between physiological changes and the ability 
of the worker to function on the job, still less 
is known about the personality characteristics 
and reactions of workers as they get older and 
how these characteristics and reactions affect 
job performance. Although a body of opinion 
exists to indicate that older workers are unable 
to work under younger supervisors, resist 
changes in work methods or the introduction 
of new machinery, are difficult to work with, 
etc., supporting evidence is not available. 

The purpose of this study is to investigate the extent to which graduate students subscribe to the commonly held beliefs about older workers. A questionnaire of 51 statements on the older worker was developed. These statements were classified into the following 9 categories:¹ Physical, Mental, Resistance to New, Reaction to Criticism, Keeping Youth Down, Employer Attitudes—Costs, Waiting for Retirement, Interpersonal, Jobs. The material for these statements was obtained by interviews with employers, employment counselors engaged in job placement work with older applicants, public employment service officials, directors of private agencies engaged in job placement activity, job applicants, and by a review of the literature.

¹ The statements were first classified independently by each of the authors. For those statements on which there was no agreement there was further discussion.

The directions employed and several state- 
ments to illustrate the form of presentation 
follow: 


Directions: Below are statements about older workers. If you are in general agreement with these statements, put a circle around the Yes. If you are in general disagreement with the statement, put a circle around the No. Answer all questions. If you are not sure, guess.

Yes No  1. Older workers fail in emergencies.
Yes No 17. They have no ambition.
Yes No 47. They keep younger men from getting ahead.


The questionnaire was administered to 147 
graduate students (92 men and 55 women) 
enrolled in a course on the psychology of the 
adult at Teachers College, Columbia Univer- 
sity. There was no time limit, but approx- 
imately 10 minutes were required for the 
completion of the questionnaire. For men, 
the age range was from 20 to 48 years with a 
mean age of 29.5 years; for women, from 20 to 
51 years with a mean age of 33.3 years. The 
mean age of the combined group was 30.9 
years with a standard deviation of 7.2 years. 
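
The combined mean age can be checked against the two sex subgroups as a weighted average. The short sketch below uses only the group sizes and means reported in the paragraph above; the variable names are illustrative.

```python
# Group sizes and mean ages as reported for men and women.
men_n, men_mean_age = 92, 29.5
women_n, women_mean_age = 55, 33.3

# Weighted average of the two subgroup means.
combined_mean_age = (
    (men_n * men_mean_age + women_n * women_mean_age) / (men_n + women_n)
)
print(round(combined_mean_age, 1))
```

The result agrees with the reported combined mean of 30.9 years.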

The mean score (the number of yes responses) and standard deviation by sex and age are presented in Table 1. Only two age categories were set up, under thirty years of age and over thirty years of age, because the group was not large enough to warrant a larger number of age categories. The mean age of the group under thirty years of age (46 men and 23 women) was 24.6 years, with a standard deviation of 2.2 years. The mean age of the group over thirty (46 men and 32 women) was 36.5 years with a standard deviation of ... years. The group under thirty years of age tends to score slightly higher on this questionnaire than the group over thirty years of age, but the difference is not statistically reliable (critical ratio = 0.84). Women tend to score higher than men. The critical ratio of 2 between men and women indicates that the difference is significant at the 5% level of confidence.

Table 1

Mean Score and Standard Deviation on Older Worker Questionnaire by Age and Sex

                 N     Mean    S.D.
Age Group
  20-29 yrs.     69    14.3    ...
  30-51 yrs.     78    13.1    ...
Sex
  Male           92    12.5    ...
  Female         55    15.6    ...
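
As a reading aid, a critical ratio is the difference between two means divided by the standard error of that difference. The sketch below assumes the usual large-sample form of the standard error; the authors do not state their formula. It also backs out the standard error implied by the reported age-group ratio, reading the Table 1 means as 14.3 and 13.1. The standard deviations in the final call are hypothetical, since the printed values are not legible.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two means divided by its standard error
    (assumed form: sqrt(sd_a^2 / n_a + sd_b^2 / n_b))."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


def implied_standard_error(mean_a, mean_b, ratio):
    """Standard error of the difference implied by a reported critical ratio."""
    return (mean_a - mean_b) / ratio


# Age groups: means 14.3 and 13.1, reported critical ratio 0.84.
age_se = implied_standard_error(14.3, 13.1, 0.84)
print(round(age_se, 2))

# Hypothetical standard deviations (8.0 and 9.0), for illustration only.
hypothetical_cr = critical_ratio(14.3, 8.0, 69, 13.1, 9.0, 78)
print(round(hypothetical_cr, 2))
```

A ratio of about 1.96 or more corresponds to the 5% level under the normal approximation.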
An item analysis was made of the questionnaire to determine which statements show an age or sex difference as well as to determine the proportion of Teachers College graduate students, by sex and age, in agreement with each of the 51 statements about older workers and the categories into which these statements were classified. This information is given in Table 2.
The data in Table 2 indicate the absence of age differences. In comparing the two age groups there is only 1 statement significant at the 5% level of confidence. A higher proportion of students under thirty years of age than those over thirty agree that older workers will not take on additional responsibilities. In comparing men and women there are 7 statements significant at the 5% level of confidence. A higher proportion of women agree that older workers fail to keep up with changing methods of work, are in a rut, cannot supervise others well, cannot concentrate, are not physically able to keep up with the work, are difficult to work with. A smaller proportion of women agree that older workers get occupational diseases more often. Although the number of statements showing a significant difference between men and women is not materially greater than a chance expectation, the difference appears to be a real one since 80% of the test items show a higher percentage of agreement for women.
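
The two comparisons in this paragraph can be made concrete with a rough binomial sketch. It treats the 51 statements as independent trials, which is an assumption the data do not strictly satisfy because the same respondents answered every statement. The helper name and the rounding of 80% of 51 to a whole number of items are illustrative choices, not the authors' procedure.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_items = 51

# Number of statements expected to reach the 5% level by chance alone.
expected_by_chance = n_items * 0.05

# Chance of 7 or more statements reaching the 5% level under independence.
p_seven_or_more = binomial_tail(n_items, 7, 0.05)

# Sign-test view: about 80% of the items show higher agreement for women.
items_higher_for_women = round(0.80 * n_items)
p_sign = binomial_tail(n_items, items_higher_for_women, 0.5)

print(expected_by_chance, items_higher_for_women)
print(p_seven_or_more, p_sign)
```

The second calculation corresponds to the authors' argument that the direction of the differences, more than the count of significant items, points to a real sex difference.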
For the total group the percentage of agreement with the 51 statements in the questionnaire ranges from 0 per cent to 78 per cent with a mean percentage of agreement of 27 per cent. From 70 per cent to 78 per cent of the group subscribe to the belief that older workers need more time to learn new operations, take longer in getting over illness, take longer in getting over injuries. From 63 per cent to 68 per cent subscribe to the belief that older workers are


Table 2

Proportion of Teachers College Graduate Students, by Age and Sex, in Agreement with Older Worker Statements as Classified into 9 Categories

Each statement is followed by five percentages, in this order: Age Group 20-29 (N=69), Age Group 30-51 (N=78), Male (N=92), Female (N=55), Total (N=147).

Resistance to New
 5. They fail to keep up with changing methods of work.  41  32  28*  49  36
38. They resist new ways of doing things.  59  56  52  67  58
42. They are suspicious of labor saving machines.  42  44  40  47  ...
43. They look to the past.  71  65  68  67  68
Mean Resistance to New  53  49  47  58  51

Physical
 2. They are slow.  65  67  65  67  66
 7. They have a high rate of absenteeism.  10  6  11  7  5
24. They need longer rest periods more often.  58  59  53  67  59
26. They have accidents often.  13  18  14  18  16
27. They get occupational diseases more often.  30  21  33**  13  25
36. They are not physically able to keep up with the work.  ...  ...  ...  ...  24
39. They take longer in getting over illness.  74  79  77  76  77
46. They take longer in getting over injuries.  81  76  78  78  78
Mean Physical  45  44  43  46  44

Reaction to Criticism
30. They will not carry out plans assigned by supervisors.  ...  ...  ...  ...  ...
33. They ... take criticism ...  ...  ...  ...  ...  ...
37. They dislike to work under younger supervisors.  ...  ...  ...  ...  ...
Mean Reaction to Criticism  31  27  27  36  29

Mental
 1. Older workers fail in emergencies.  ...  ...  ...  ...  ...
 6. They are unsure of themselves.  ...  ...  ...  ...  ...
 8. They are in a rut.  38  24  23  44  ...
 9. They make many errors.  ...  ...  ...  ...  ...
10. They get rattled when ...  ...  ...  ...  ...  ...
14. They show poor judgment.  ...  ...  ...  ...  ...
19. They have difficulty in planning their work.  9  5  5  9  ...
28. They cannot concentrate.  9  6  3*  15  ...
35. They need more time to learn new operations.  ...  72  71  78  74
41. They are slow to catch ...  55  46  47  56  51
45. They are mentally unable ... the job.  ...  ...  ...  ...  ...
51. They have limited skill.  ...  ...  ...  ...  ...
Mean Mental  ...  ...  ...  ...  23

* Significant at .05 level.
** Significant at .01 level.

Jacob Tuckman and Irving Lorge 


Table 2—Continued 


Category 
Keeping 


Youth 
Down 


Employer 
Attitudes— 
Costs 


Waiting for 


Retirement 


Interpersonal 


Jobs 


Age Group Sex 
20-29 30-51 Male Female 
N=69 N=78 N=92 N=55 
Statement % % % % 
4, They take jobs away from younger workers. 22 21 18 a 
13. They get all the breaks. 1 3 2 
15. They take credit for the work done by 6 

younger men. 9 14 3 1 
32. They are critical of younger workers. 61 56 57 62 
47. They keep younger men from getting ahead. 23 18 18 24 
Mean Keeping Youth Down 23 22 21 26 
12. They increase production costs. 23 15 14 27
21. They are paid too much for the amount of work they do. 4 1 2 4
25. They increase costs of pensions for employers. 48 49 48 49
31. They spoil much of their work. 1 0 0 2
34. They do not produce as much as younger workers. 33 35 33 36
Mean Employer Attitudes—Costs 22 20 19 24
16. They are interested only in putting in their hours. 9 4 7 5
17. They have no ambition. 4 3 3 4
22. They will not take on additional responsibilities. 17* 6 9 16
23. They just wait for retirement. 10 4 7 9
29. They are interested more in security than job advancement. 71 56 60 69
Mean Waiting for Retirement 22 15 17 21

3. They cannot win the confidence and loyalty 

of fellow workers, 6 3 2 A 
18. They are unable to smooth out disagree- 

ments between other workers. 9 4 3 ii 
20. They cannot supervise others well. 10 8 4* 16 
40. They are critical of their fellow workers, 29 37 29 40 
44. They are suspicious of other workers, 16 22 18 20 
48. They are difficult to work with, 10 14 5** 24 
49. They cannot listen to other people's complaints without getting irritated. 7 12 8 13
Mean Interpersonal 12 14 10 19 
11. They lose jobs often. 

iis) x 32 19 24 
50. They quit jobs frequently, 0 0 ” 0 
Mean Jobs 16 10 13 12
ean ‘ategori 
€gories 28 25 4 30 


Total 


N=147 


‘0 


21 




slow, are more interested in security than 
advancement, look to the past. Less than 
5 per cent of the group subscribes to the belief 
that older workers cannot win the confidence 
and loyalty of fellow workers, get all the 
breaks, have no ambition, are paid too much 
for the work they do, will not carry out plans 
assigned by supervisors, spoil much of their
work, quit jobs frequently.

The percentage of agreement with the
various categories into which the statements
were classified varies from 13 per cent to 51
per cent. The group subscribes more to
statements covering resistance to new ideas
and procedures, and physical changes than to
statements covering interpersonal relationships
and job shifting. The mean percentage of
agreement is 51 per cent for Resistance to
New, 44 per cent for Physical, 29 per cent for
Reaction to Criticism, 23 per cent each for
Mental and for Keeping Youth Down, 21 per
cent for Employer Attitudes—Costs, 19 per
cent for Waiting for Retirement, and 13 per
cent each for Interpersonal and Jobs.

The differences in the proportion of agreement
with the statements in the questionnaire
and the categories into which they had been
classified may be due to a reluctance on the
part of the subjects to subscribe to statements
which penalize the older worker. This is


evident by the lack of consistency between 
statements in some areas. For example, 
although 75 per cent of the group agree that 
older workers are slow, 55 per cent agree that 
they need longer rest periods more often, 
25 per cent that they have limited skill, 25
per cent that they are physically unable to 
keep up with the job, and 13 per cent that 
they increase production costs, only 3 per cent 
of the group agree that older workers are 
paid too much for the work that they do. 

It is evident from the data that there is 
considerable acceptance of erroneous ideas 
about older workers. This is even more 
surprising when the educational level, previous 
training in psychology, and the interest of the 
group in the older adult as evidenced by 
enrollment in a course dealing with the aging 
process, are taken into consideration. The 
study indicates that there is a need for more 
data to prove or disprove the prejudices and 
misconceptions about the skills, abilities and 
personality characteristics of the older worker. 


Received June 11, 1951.


References 


1. Older workers at the public employment service, 
The Labor Market, June, 1949, pp. 31-37, 

2. Kossoris, M. D. Absenteeism and injury experience 
of older workers. Monthly Labor Review, July, 
1948, pp. 16-19. 


Attitudes Toward the Employment of Older People *


Wayne Kirchner, Theodore Lindbom and Donald G. Paterson 


Industrial Relations Center, University of Minnesota 


The Industrial Relations Center at the 
University of Minnesota has undertaken a 
series of studies of the older employee in busi- 
ness and industry. It was recognized that 
attitudes toward older persons in the labor 
market might be important determinants in 
their efficient utilization. Furthermore, the 
availability of an attitude scale on the subject 
would make possible empirical studies of the 
extent to which different groups in employment 
relationships (managements, unions, older 
workers, younger workers, etc.) hold similar 
or diverse attitudes toward the older worker. 


Development of the Attitude Scale 


A total of 53 items was drawn up on the 
basis of a survey of the literature plus the 
gathering of opinions about the older employee. 
Examples include: 


I think older employees have fewer accidents 
on the job. 


Older employees are harder to train. 


The Likert technique of summated ratings 
was employed (1, 2, 3). 

The pool of 53 items was administered to 
42 senior college and graduate students in a 
Vocational and Occupational Psychology class 
and to 38 supervisors enrolled in an Extension 
Division course in Supervisory Training. No 
significant difference was found in mean score 
or σ for these two classes.

The Scale Value Difference technique (SVD) 
as discussed by Rundquist and Sletto (4, Ch. 
10) was then utilized in item analysis for each 
class separately (top and bottom quarters on 
total score). 

The results revealed the need for redefining
many of the items as well as the elimination of
many items which were too neutral, i.e., did
not reveal favorableness or unfavorableness
toward older people as employees.


* Aid of the following persons is gratefully acknowledged:
Dr. Dale Yoder, Director, and Dr. H. G. Heneman,
Assistant Director of the Industrial Relations Center,
and the following IRC Research Assistants: Mrs.
Elizabeth Burr, Marvin Dunnette, Harland Fox, and
Thomas Mahoney.


A revised scale of 27 items was then drawn
up. Each was to be scored 0, 1, 2, 3, or 4 on
the degree of agreement or disagreement, with
0 or 1 reflecting unfavorable attitudes and 3
or 4 reflecting favorable attitudes. The higher
the total score the more favorable the attitude
toward older employees. A total score of 54
would thus be indicative of a neutral attitude.
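The scoring rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: the function name `total_score`, the `reverse_keyed` argument (for unfavorably worded items that would have to be flipped before summing) and the sample response patterns are introduced here. The sketch shows why the neutral total for 27 items scored 0 to 4 falls at 27 x 2 = 54.

```python
# Hypothetical sketch of summated-rating (Likert) scoring for a 27-item scale.
N_ITEMS = 27
MAX_ITEM_SCORE = 4  # each item is scored 0, 1, 2, 3, or 4


def total_score(responses, reverse_keyed=()):
    """Sum item scores so that a higher total means a more favorable attitude.

    responses     -- one integer 0..4 per item
    reverse_keyed -- indices of unfavorably worded items (an assumption here);
                     their scores are flipped so that 4 is always favorable
    """
    if len(responses) != N_ITEMS:
        raise ValueError("expected one response per item")
    total = 0
    for index, score in enumerate(responses):
        if not 0 <= score <= MAX_ITEM_SCORE:
            raise ValueError("item scores must lie between 0 and 4")
        total += (MAX_ITEM_SCORE - score) if index in reverse_keyed else score
    return total


# A respondent who marks the midpoint (2) on every item lands on the neutral total.
NEUTRAL_TOTAL = N_ITEMS * MAX_ITEM_SCORE // 2
```

On this reading a management mean of 55.8 sits less than two points above the neutral total, while a rank-and-file mean of 65.1 sits about eleven points above it.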


Application of Revised Scale 


The 27 item scale was then given to a plant-
wide sample of 46 rank-and-file employees and
16 supervisors and executives in a laundry.

An item analysis of the top and bottom 25
per cent of total scores for the rank-and-file
sample showed 19 of the items to have SVD
values of 1.00 or greater. Only four items
seemed to be non-discriminating. Thus the
revised scale was shown to meet the requirement
of “internal consistency.” This conclusion
was also supported by a corrected
odd-even r of .90 for the same rank-and-file
sample.
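A "corrected odd-even r" is ordinarily obtained by correlating scores on the odd-numbered and even-numbered items and stepping the result up with the Spearman-Brown formula. The sketch below assumes that this standard procedure is what was used; the paper does not print the uncorrected half-test correlation, and the function names and the tiny data set in the usage note are hypothetical.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r_half, k=2.0):
    """Estimated reliability of a test k times as long as the part tests."""
    return k * r_half / (1.0 + (k - 1.0) * r_half)


def half_test_r(r_full):
    """Half-test correlation implied by a corrected (k = 2) reliability."""
    return r_full / (2.0 - r_full)


def odd_even_reliability(item_scores):
    """item_scores: one list of item scores per respondent."""
    odd = [sum(person[0::2]) for person in item_scores]
    even = [sum(person[1::2]) for person in item_scores]
    return spearman_brown(pearson_r(odd, even))
```

Under these assumptions a corrected value of .90 corresponds to an uncorrected odd-even correlation of about .82, which is what `half_test_r(0.90)` returns.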

Table 1 reveals that there is a statistically
significant difference between the mean attitude
scores of the rank-and-file employees and
representatives of management (“t” ratio
= 2.57, probability less than .05). The mean
score for management of 55.8 is approximately
at the neutral point whereas the rank-and-file
employees are more favorably disposed toward
the employment of older workers. The
difference is 9.3 points which approximates
the variability of the management sample
(σ = 9.8).


Table 1


Mean and S.D. on 27-item Attitude Scale for Plant-wide
Sample of Rank-and-file Employees and
Representatives of Management

Group         N    Mean   S.D.
Employees     46   65.1   13.0
Management    16   55.8    9.8


One other point deserves mention. The
variability of attitudes among the rank-and-file
is appreciably larger (σ = 13.0) than the
variability of the management sample (σ
= 9.8).


Table 2


Mean and S.D. on 27-item Attitude Scale for Plant-wide
Sample of Rank-and-file Employees
Classified by Age


Age Group   N    Mean   S.D.
Under 30    13   52.6    9.5
30-49       16   65.7   11.7
Over 50     17   74.0

Total       46   65.1   13.0

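The reported “t” ratio can be approximated from the summary figures alone. The sketch below is a reconstruction under assumptions, not the authors' computation: it reads the two standard deviations as 13.0 (rank-and-file) and 9.8 (management), assumes a pooled-variance t for independent groups, and assumes the S.D.s were computed with N rather than N - 1 in the denominator. The function name `pooled_t` is introduced here.

```python
import math


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Pooled-variance t ratio from summary statistics.

    The S.D.s are treated as computed with N in the denominator (an
    assumption), so the sums of squares are n * sd**2.
    """
    pooled_variance = (n1 * sd1 ** 2 + n2 * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_variance * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error


# Rank-and-file employees vs. representatives of management.
t_ratio = pooled_t(46, 65.1, 13.0, 16, 55.8, 9.8)
```

With these inputs the ratio comes out at about 2.57 on 60 degrees of freedom, in line with the value quoted in the text. Treating the S.D.s as N - 1 estimates would give a slightly larger figure.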
Table 2 presents the analysis of age differences
in attitudes of rank-and-file employees.
The mean difference between each age group
and every other age group is statistically
significant. It is clear that attitudes toward
the older employee are a function of age and
the younger the person the less favorable is
his attitude toward the older worker. The
mean score for the “under 30” group, however,
is only slightly below the neutral point
on the scale. Thus the younger worker, on the
average, is not antagonistic toward the
older worker. But as rank-and-file employees
grow older they exhibit increasingly favorable
attitudes toward the older worker.

As a matter of fact, the differences are
striking. For example, the mean difference
between the “under 30” group and the “over
50” group is 21.4 points which is two or three
times the magnitude of their respective
standard deviations! This means that there is
little overlapping in attitude scores of the
two age groups.

A number of additional comparisons of
attitude scores of sub-groups of the rank-and-file
employees were made. No sex difference
was disclosed. Amount of previous formal
education was unrelated to attitude scores.
There was no difference between office
workers and the production workers. Nor was
there any difference between those who
belonged to the union and those who did not
belong to the union. This last comparison
reflects the fact that there was no difference
between the office workers (non-union) and
the unionized production workers.

Additional comparisons did show differences 
but these appear to be tied to the age factor. 
Married workers had slightly higher scores 
than single workers. Years with the company 
and years on present job both showed striking 
mean differences in the expected direction. 

One additional comment seems warranted. 
The median age of the management group was 
44. If their attitudes reflected the age factor 
alone we would expect them to have a mean 
score of about 66 (see Table 2 for the “30-49” 
group). But their mean score approximates 
that of the “under 30” employee group. This 
suggests a “conflict situation” for management 
representatives. Being older than the rank- 
and-file would make them more favorable 
toward older workers. But recognizing that 
older workers may need to be shifted to other 
jobs where the pace is not as fast may force 
them to take a less favorable point of view. 


Cross-Validation of the Scale 


The 27-item attitude scale was administered 
to a random sample of production employees in 
a metal manufacturing company. The results 
are presented in Table 3. Although the 
number of such employees is very small, 
nevertheless, the age differences are the same 
as were found among the laundry employees 
and the mean differences are significant at the 
1 per cent level. 


1 In this laundry there was a segregation of older and 
younger employees. The latter were on jobs requiring 
“speed of movement” whereas the former were on 
“mending and repair jobs” where each could work at 
his own pace. 


Table 3 


Mean and S.D. on 27-item Attitude Scale for Random 
Sample of Production Employees in a Manufacturing 
Company Classified by Age 


Age Group   N    Mean   S.D.
Under 30    11   51.6   10.4
30-49       12   61.1    9.4
Over 50      4   69.3   12.4

Total       27   58.4   12.1
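As a simple arithmetic check on the age tables, the Total mean in each table should equal the N-weighted average of the three age-group means. The sketch below uses only the Ns and means printed in Tables 2 and 3; the helper name `weighted_mean` is introduced here for illustration.

```python
def weighted_mean(groups):
    """groups: list of (n, mean) pairs; returns the N-weighted grand mean."""
    total_n = sum(n for n, _ in groups)
    return sum(n * mean for n, mean in groups) / total_n


# (N, mean) for the under 30, 30-49 and over 50 groups.
laundry = [(13, 52.6), (16, 65.7), (17, 74.0)]          # Table 2
metal_company = [(11, 51.6), (12, 61.1), (4, 69.3)]     # Table 3

laundry_total = round(weighted_mean(laundry), 1)
metal_total = round(weighted_mean(metal_company), 1)
```

Both weighted averages agree with the printed Total rows (65.1 and 58.4), so the group Ns and means in the two tables are internally consistent.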


Revised 24-item Scale 


On the basis of the three item analyses and 
the results of the 27-item scale secured to date, 
the Industrial Relations Center is now using a 
24-item scale entitled “Questionnaire About 
Problems of Older Employees.” It is anti- 
cipated that this new scale will facilitate the 
study of a variety of occupational adjustment 
problems of older persons. 


Summary 


This paper describes the development of a 
scale for measuring attitudes toward the 
employment of older people. The Likert 
method of summated ratings was used and a 
pool of 53 items was drawn up. Two item
analyses using the Scale Value Difference 
technique resulted in a 27-item scale of high 
reliability (r = .90). 

One of the most significant findings to date 
has been the striking age difference disclosed by 
use of the scale. Rank-and-file employees 




“under 30” are, on the average, neutral 
toward the employment of older workers. 
The “30 to 49” years group and the group 
“over 50” are increasingly favorable toward 
the older worker. 

As a result of a third item analysis a 24-item 
scale is now being used in a variety of studies
of the occupational adjustment problems of 
older workers. 


Received March 26, 1952. 
Early publication. 


References 


1. Bird, C. Social psychology. New York: Appleton-
Century, 1940. Pp. 147-165.

2. Edwards, A. L., and Kenney, Kathryn C. A comparison
of the Thurstone and Likert techniques
of attitude scale construction. J. appl. Psychol.,
1946, 30, 72-83.

3. Likert, R. A technique for the measurement of
attitudes. Arch. Psychol., N. Y., 1932, No. 140.

4. Rundquist, E. A., and Sletto, R. F. Personality in
the depression; a study in the measurement of
attitudes. Minneapolis: University of Minnesota
Press, 1936.


Postwar Research in Pilot Selection and Classification *


Abraham S. Levine and Ernest C. Tupes


USAF Training Command, Human Resources Research Center, Personnel Research Laboratory,
Lackland Air Force Base, San Antonio, Texas


The extensive World War II research on the
Aviation Cadet Classification Battery has been
reported in the Army Air Forces Psychology
Program Research Reports (2, 3, 4). This
article proposes to summarize major findings
of postwar research in pilot selection and
classification in the United States Air Force.

Throughout World War II an extensive
program of research on the problems of the
selection and classification of aircrews was
carried out by a staff of aviation psychologists
and psychological assistants. At first, the
Aircrew Classification Battery was used for
classification purposes only. Preliminary selection
of men to be trained for aircrew was
accomplished by the Army Air Forces Qualifying
Examination. As the Aircrew Classification
Battery was shown to be substantially
predictive of later success, it came to be used
for selection of aircrew as well as for classification
without replacing the AAF Qualifying
Wemnination. Minimum qualifying aA 
with changed from time to time m accorda’ 
or, the requirements of train} 
tah ations. The classificatio! 
tests developed from a collection © 
o S about which relative 
appa group of well-validatec, 
ar; ratus tests carefully selecte : 

ge number of devices available een rs 
World aed situation liffered TO 

aE ituation 1 

wens :1. During World War II the examiner 
in i typically graduate students with Aaea 
profa ology or education, but @ ter th 
no essionally trained airmen wer par 
in tainable. 2. The test pattery was 


an extensive 
blems of the 


February 1947 and stand es 
a x $10. 
b S Ty 1948 with new con gabet 1947, 


ased b 
£ on postwar cadets. Since 


* The views expressed in this article are those of the
authors and do not necessarily represent official
views of the United States Air Force.

however, the aircrew battery has been admin- 
istered on an experimental basis only, because 
of the decreasing supply of applicants. 3. The 
World War II primary phase and part of the 
basic phase of the pilot training were combined
into a single basic phase of pilot training. 

4. World War II first-phase pilot training was 
at civilian contract schools, but postwar first- 
phase training was conducted at Air Force 
bases. 5. After the war, the proportion of 
pilot trainees with previous flying experience 
increased greatly. 6. During World War II 

most eliminations were for flying deficiency. 
Since then, a large part of the eliminations 
have been for other reasons, with motivation 
assuming a prominent role. 

Table 1 indicates that despite all of these
changes, the pilot stanine¹ for predicting
graduation vs. total elimination for two 1950
classes has a biserial validity of .57. This
validity figure compares favorably with World
War II validities (2). The validity of the
pilot stanine for predicting eliminations due to
flying deficiency alone is .60. Since a relatively
small percentage of cadets were eliminated for
non-flying reasons during World War II, the
flying deficiency criterion is perhaps most comparable
to the World War II criteria. The
validities of the pilot stanine for motivational
and administrative reasons for elimination are
somewhat lower. Motivational eliminees comprise
those cadets who voluntarily resigned or
who were eliminated because of “fear of 
flying.” The category of administrative elim- 
inees is made up mostly of those cadets 
eliminated for physical reasons . . . disable- 
ment during training, sickness, or physical 
defects not discovered in the qualifying 
examination. 
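For readers unfamiliar with the statistic, a biserial validity of this kind can be computed from the predictor scores and the graduate-eliminee split roughly as follows. This is a sketch of the textbook biserial formula, not the Air Force computing routine; the function name `biserial_r` and the ten scores and outcomes in the usage note are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def biserial_r(scores, graduated):
    """Biserial correlation between a score and a pass/fail criterion.

    r_bis = (M_grad - M_elim) / S.D. * (p * q / y), where p and q are the
    proportions graduated and eliminated and y is the ordinate of the unit
    normal curve at the point dividing it into areas p and q.
    """
    grads = [s for s, ok in zip(scores, graduated) if ok]
    elims = [s for s, ok in zip(scores, graduated) if not ok]
    p = len(grads) / len(scores)
    q = 1.0 - p
    y = NormalDist().pdf(NormalDist().inv_cdf(q))
    return (mean(grads) - mean(elims)) / pstdev(scores) * (p * q / y)


# Hypothetical stanines and outcomes for ten cadets (not data from the study).
example_scores = [2, 3, 4, 5, 5, 6, 7, 8, 9, 4]
example_graduated = [False, False, False, True, False, True, True, True, True, True]
example_r = biserial_r(example_scores, example_graduated)
```

Because the term p q / y depends on the proportion graduated, the same mean difference yields different coefficients as that proportion shifts from one sample to the next.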

In general, the validity pattern of the 


1 A stanine is a composite test score converted to a
standard scale that ranges in one-half standard
deviation units from 1 (lowest) to 9 (highest) with a
mean of 5.
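The conversion described in this footnote can be sketched as follows. The function assumes the usual stanine convention, in which each of the nine steps is one-half standard deviation wide and step 5 is centered on the mean; the name `stanine_from_z` is introduced here, and the operational conversion tables may well have differed in detail.

```python
import math


def stanine_from_z(z):
    """Map a standard score z to a stanine from 1 (lowest) to 9 (highest).

    Stanine 5 covers z from -0.25 up to 0.25; each neighboring stanine
    covers the next half standard deviation, and the tails are collapsed
    into stanines 1 and 9.
    """
    step = math.floor(2.0 * z + 5.5)
    return max(1, min(9, step))
```

A composite score exactly at the mean thus receives a stanine of 5, and anything more than 1.75 standard deviations above the mean receives a 9.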






Table 1 


Biserial Validities of the Pilot Stanine for Four Elimination Categories in Basic Pilot Training 
Note: Sample: Classes 50B-50C; Variable: Pilot Stanine; Criterion: Graduation vs. Elimination. 


                                            Total   Proportion
                                            N       Graduated    Est. r bis*
Graduates vs. Flying Deficiency Eliminees   570     .67          .60
Graduates vs. Motivational Eliminees        486     .79          .56
Graduates vs. Administrative Eliminees      448     .86          .49
Graduates vs. All Eliminees                 736     .52          .57


* Weighted medians of the biserial r’s computed for Classes 50B and 50C separately. 
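The proportions graduated in Table 1 can be cross-checked against the total Ns. The sketch below rests on one assumption that the table does not state: that the same group of graduates appears in every comparison, so that a single graduate count must reproduce each printed proportion. It reads the flying deficiency, administrative and all-eliminee rows as 570 at .67, 448 at .86 and 736 at .52; the function name is introduced here.

```python
def consistent_graduate_counts(rows):
    """rows: list of (total_n, reported_proportion) pairs.

    Returns every whole-number graduate count that rounds to each reported
    proportion (two decimals) when divided by the corresponding total N.
    """
    smallest_n = min(n for n, _ in rows)
    return [
        count
        for count in range(1, smallest_n + 1)
        if all(round(count / n, 2) == p for n, p in rows)
    ]


counts = consistent_graduate_counts([(570, 0.67), (448, 0.86), (736, 0.52)])
motivational_proportion = round(counts[0] / 486, 2)
```

Under that assumption only one graduate count, 384, fits all three rows, and it implies a proportion graduated of about .79 for the comparison with the motivational eliminees (N = 486).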


various tests is found to be much the same as 
under the wartime conditions. General Infor- 
mation and Rudder Control tests, however, 
have greatly increased validity for predicting 
flying deficiency eliminations. The increased 
predictive efficiency of these tests may be 
accounted for in terms of the positive influence 
of previous flying experience on these test 
scores, the fact that a much greater percentage 
of the postwar population have had such 
experience than was true of the wartime 
population, and the good prognosis for men 
with previous flying experience. It is interest- 
ing to note in this connection that, of the 
non-test variables validated, previous flying 
experience is highly valid, while both age and 
education have essentially zero validities. 
Despite the substantial correlation of previous 
flying experience with successful completion of 
basic training, the contribution of this variable 
to the multiple R is small since this factor is 


being largely measured by the battery, particularly
by the General Information and Rudder
Control tests. 

The biserial validity of the pilot stanine for 
predicting the eliminations primarily due to 
motivational reasons is .56. Although this 
validity is high enough to have useful predictive 
value there is reason to suspect that the 
present aircrew battery would have to be 
reweighted, some of the tests would have to be 
rekeyed,² and it would have to be supplemented


2 As used here, the term “reweighting” the tests refers 
to assigning new regression weights based on validities 
and beta weights for prediction of the different elimina- 
tion categories. The term “rekeying” refers to the 
development of new scoring keys based on item analysis 
against the criteria of graduation vs. each of the elimi- 
nation categories. The present paper discusses the 
results of rekeying certain tests but is not concerned 
with the possible reweighting of these tests on the basis
of present validities and beta weights.


by additional tests in order to predict motiva- 
tional eliminations as well as it predicts flying 
deficiency eliminations for successive samples. 
This hypothesis is partially based on findings 
for three 1949 classes (total N = 1596) in 
which the pilot stanine correlated .63 with 
the criterion for flying deficiency elimination 
but only .34 for the category of motivational 
eliminees (5). Thus, it seems that the pilot 
stanine possesses considerable stability for 
the prediction of flying deficiency eliminations 
even though the validities for motivational 
elimination may fluctuate over a considerable 
range for successive samples. The relative 
instability of the pilot stanine for prediction of
motivational elimination may be attributable
to: (1) differing proportions of motivational
eliminees and the effect of shifting pq values
on the stability of the biserial r, (2) differential
composition of successive samples of motivational
eliminees (differing proportions of cadets
who resign because of misgivings regarding
their flying ability), and (3) to the fact that
the Aircrew Classification Battery was developed
and weighted to predict a wartime
criterion in which the eliminations were almost
exclusively for flying deficiency reasons.
Therefore, research designed to increase the
validity and stability of prediction of motivational
eliminations should be directed toward
the development of more specific measures of
the motivational category and the investigation
of the effect of changing conditions on the
proportion and composition of this criterion
group.

A second category of elimination due to
non-flying deficiency is named administrative
elimination. The validity of the pilot stanine
for this criterion is .49. This validity is
surprisingly high for a criterion apparently
heavily weighted with eliminations for physical
reasons.

Because of the increasing number of eliminations
from basic pilot training for reasons other
than flying deficiency, research was undertaken
to determine whether rekeying some of the
printed tests of the Aircrew Classification
Battery against the criteria of graduation vs.
specific elimination categories would result in
higher and more stable validities and whether
measures of morale would have significant
validity. Five tests were used in this study
to predict each of the three arbitrarily defined
elimination categories . . . flying deficiency,
administrative, and motivational. Three of
these tests are in the Aircrew Classification
Battery, and two are experimental tests which
were administered along with the battery.

The three Aircrew Classification Battery
tests studied were selected on theoretical
grounds as those most likely, on the basis of
their content, to contain measures of motivational
factors. The General Information test
has two parts. Part One contains aviation
information items, and Part Two contains
items requiring fairly specialized knowledge in
various fields; so successful answering of these
items may indicate an above average interest in
these fields. The Biographical Data Blank
contains items requesting background information
regarding the subject’s hobbies, sports,
Schools, and job histories, which a a So 
“ferential measures of interest. The cvs 
cal T udgment test involves verbal a 
ut it seems likely that the various wrong 


alternati : ich were each 
nat A which 5 
natives to each item, e merit as 


ned attractive, might have © "differences. 
ications of interest or personality cea 
The Attitude Survey, Form AC, is one of the
two experimental tests. It is a slightly
modified form of an Attitude Survey originally
developed to obtain a measure of airman
morale. The Attitude Survey, though administered
during the first week of training, before
the cadet has had more than brief acquaintance
with military life, is the best single predictor
of motivational elimination yet investigated.³


3 It should be noted that the tests were administered
under experimental conditions with the assurance given
that the results would in no way affect their
future careers. The Attitude Survey, like other questionnaire
type instruments, appears to be susceptible to faking
so that its validity in an operational situation
would probably be somewhat lower.



Its biserial validity for predicting this category 
of elimination ranges around .45. It comprises 
such items as: “Do you think you would have 
good food and pleasant surroundings in most 
military hospitals?” (Alternative responses: 
“Yes,” “No,” or “I don’t know.”) This 
question, though posed before most of the 
cadets have had an opportunity to experience 
a military hospital, discriminates well between 
graduates and motivational eliminees. The 
item probably derives its validity from the 
fact that it is an expression of opinion rather 
than a reflection of experience. Interestingly 
enough, motivational eliminees, as a group, 
express less satisfaction with the quantity of 
food served in the mess hall, though probably, 
as a group, they have no bigger appetites 
than graduates of basic pilot training. Since 
the Attitude Survey can be validly admin- 
istered so early in training, it is conceivable 
that a slightly modified form (less susceptible 
to faking) may also be successfully given in 
the recruiting station. 

The other new test studied, the Biographical 
Inventory, is similar to the Biographical Data 
Blank already in the Aircrew Classification 
Battery. The main differences are that the 
Biographical Inventory is longer, it embraces 
more background areas, and, in general, it 
seeks somewhat more detailed information. 

Table 2 presents the multiple correlations 
between the five test variables used in this 
study and the various graduation vs. specific 
elimination criteria.


Table 2


Multiple Correlational Data for Prediction of Three
Elimination Categories in Basic Pilot
Training Classes 50-B and 50-C

Note: Variables:* General Information CE505F;
Practical Judgment CI301C; Biographical Data Blank
CE602D; Biographical Inventory BE601FXAC; and
Attitude Survey A/C. Criterion: Graduation vs. Elimination.
N = 430-583.

                                            R
Graduates vs. Flying Deficiency Eliminees   .62
Graduates vs. Motivational Eliminees        .61
Graduates vs. Administrative Eliminees      .51


* These variables were specifically keyed to predict
the respective elimination categories on Classes 49-C
and 50-A, and then cross-validated on 50-B and 50-C.


Multiple correlations between the five tests specifically keyed are
slightly higher than the validity of the pilot 
stanine (based on the entire Aircrew Classifica- 
tion Battery consisting of 22 printed and 
psychomotor tests) for each elimination cat- 
egory. For flying deficiency eliminations the 
five tests correlate .62 with the criterion 
whereas the pilot stanine correlates .60. For 
administrative elimination, the correlations 
are .51 and .49, respectively, and for motiva- 
tional elimination, .61 and .56. Were the 
other tests entering into the pilot stanine 
combined with the five experimentally keyed 
tests, there is little doubt that even higher 
correlations would result. 


Conclusions 


The following major conclusions seem to be 
justified by the data of postwar research in 
pilot selection and classification: 

1. The pilot stanine has had a surprisingly 
stable validity over a period of nearly eight 
years and under a great variety of conditions. 
If anything, its present validity for basic 
pilot training is higher than ever before. 

2. The tests in the Aircrew Classification 
Battery are optimally keyed and weighted 
for the prediction of flying deficiency elimina- 
tions. However, the efficiency and stability 
of the pilot stanine for the prediction of other 
types of elimination might possibly be increased 
by rekeying several of the tests, particularly 


General Information, Biographical Data and 
Practical Judgment. : 


Abraham S. Levine and Ernest C. Tupes 


3. If some measure of morale, such as the 
Attitude Survey, were combined with the 
Aircrew Battery and the whole put into opera- 
tional use, the elimination rate from basic 
pilot training might be reduced somewhat. 

4. However, since there is evidence of 
considerable fluctuation in validities of the 
pilot stanine for prediction of motivational 
eliminations, the predictive effect of the new 
keys and tests should be investigated for a
number of successive classes before their use 
is definitely recommended. 


Received July 16, 1951. 


References 


1. Dailey, J. T., and Gragg, D. B. Postwar research
on the classification of aircrew. Research Bulletin
49-2, Directorate of Personnel Research,
Human Resources Research Center, Lackland
Air Force Base, November, 1949.

2. Dubois, P. H. The classification program. Army
Air Forces Aviation Psychology Program Research
Reports, No. 2. Washington, 1947.

3. Flanagan, J. C. The aviation psychology program in
the Army Air Forces. Army Air Forces Aviation
Psychology Program Research Reports, No.
Washington, 1948.

4. Guilford, J. P. Printed classification tests. Army
Air Forces Aviation Psychology Program Research
Reports, No. 5. Washington, 1947.

5. Tupes, E. C., and Cox, J. A. Prediction of elimination
from basic pilot training for reasons other than
flying deficiency. Research Bulletin 51-1, Directorate
of Personnel Research, Human Resources
Research Center, Lackland Air Force Base, February,
1951.


Readability of Advertising and Editorial Copy in Time and Newsweek * 


Kendall I. Trenchard 
Fordham University 


and 


W. J. E. Crissy 


Queens College and Fordham University 


“The trade of advertising is now so near to
perfection that it is not easy to propose any
improvement.” So wrote Ben Johnson in the
Idler. Few persons concerned with advertising
today would accept this proposition. Indeed,
advertising research is on the increase and,
not in the foreseeable future, is the field apt to
become so refined as to preclude further
improvements.

One aspect of printed advertising research,
given impetus by the work of Flesch (1),
is the investigation of the readability of the
copy. While readability can be defined to
include a miscellany of factors, Flesch limits
it to Reading Ease and Human Interest.
His formulae provide a way of quantifying
each of these factors for a given sample of copy.
An extensive literature exists using the Flesch
formulae on all kinds of written communications.
Hotchkiss and Paterson (2) have
compiled a useful bibliography of this work
up to 1950. The writers used Flesch’s
operational definition of readability and employed
his formulae for determining
Reading Ease and Human Interest of samples
of advertising copy and of editorial copy with
which it, presumably, would have to compete.
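For orientation, the Reading Ease computation can be sketched as below. The constants are those of Flesch's published Reading Ease formula, and the score bands follow his descriptive-style chart; both are supplied here as background on the assumption that these are the versions the writers applied. The function names are introduced for illustration, and the word, sentence and syllable counts are assumed to come from a hand count of the sample.

```python
def reading_ease(n_words, n_sentences, n_syllables):
    """Flesch Reading Ease from counts taken on a sample of copy."""
    words_per_sentence = n_words / n_sentences
    syllables_per_word = n_syllables / n_words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Lower bound of each band, highest band first.
STYLE_BANDS = [
    (90, "Very easy"),
    (80, "Easy"),
    (70, "Fairly easy"),
    (60, "Standard"),
    (50, "Fairly difficult"),
    (30, "Difficult"),
    (0, "Very difficult"),
]


def descriptive_style(score):
    """Return the descriptive style label for a Reading Ease score."""
    for lower_bound, label in STYLE_BANDS:
        if score >= lower_bound:
            return label
    return "Very difficult"
```

A 100-word sample with 5 sentences and 150 syllables, for instance, scores about 59.6, at the upper end of the Fairly difficult band.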


i Methodology 
eas of Time and Newsweek ee a 
Y Deri y á 
1945_ i periods were studied—1 a peat 


a @ issu 
st . For each year, ten ? 
issue only selected. From each selected 
Copy, hie pages were randomly drawn. oa 
tr oth advertising and editorial, apP& 
_ disserta 


.* Thi 
ti is A ; 
ren entia Pet is based on Trenchard’s Mog Jability f 

en 


Tex 

~ Ext eds A Longitud dy 0 

in pind J ongitudinal Study t Trends 

Vor Mboria. dvertising Copy Including the Reham Uni- 
sity, 1987 Physical Characteristics- o 


on the selected pages comprised 

studied. Mean Reading Ease and Human
Interest scores for each type of copy for each
magazine for each period were computed.
Significances of differences were determined
by the usual two-tail test.


Results 


Reading Ease. In Table 1 are reported
data on Reading Ease which Tana = EE 
significance. The following differences in 

average scores are statistically significant: 


ga id goes advertising copy 
aaa editorial copy 
pee ae and editorial copy 
pp ee? n editorial copy 


Probably the most surprising, even disconcerting,
trend found is toward harder-to-read
advertising copy in both magazines, though
only in Time is this trend significant. In
contrast to this, in the case of editorial copy,
the trend is toward easier-to-read copy, but
this trend is significant in the case of Newsweek
only. Before the war, Newsweek carried advertising
copy which was significantly easier to read
than the editorial copy in juxtaposition.
This same significant difference is found in both
magazines in the post-war period.

To gauge the practical significance of these
trends and to interpret them, two additional
sets of data are needed. With regard to the
descriptive style categories, these are the
approximate educational equivalents according
to Flesch (1): Standard, 7th to 8th grade;
Fairly difficult, some high




Kendall I. Trenchard and W. J. E. Crissy 


Table 1


Pre-war and Post-war Differences in Reading Ease of Advertising and Editorial Copy in Time and Newsweek


                          Time                          Newsweek
                 Average   Descriptive         Average   Descriptive
                 Score     Style               Score     Style
Ad. Copy
    Pre-war      60.5      Standard            60.6      Standard
    Post-war     53.0      Fairly difficult    58.7      Fairly difficult
Ed. Copy
    Pre-war      44.6      Difficult           43.8      Difficult
    Post-war     46.7      Difficult           49.5      Fairly difficult


school; Difficult, high school graduation and 
some college. The educational level of readers 
of both magazines was sought and kindly 
furnished from existing survey data.¹ In the
case of Time, less than 10 per cent of the 
subscribers are below high school graduate 
level and 53 per cent are college graduates. 
In the case of Newsweek, 9 per cent of the 
family heads in homes to which subscriptions 
are addressed terminated their education with 
grade school, 33 per cent attended high school 
and 58 per cent attended college. 

With this dual frame of reference in mind, 
it is evident that all copy, both advertising and 
editorial, is within the comprehension of the 
vast bulk of the readership of both magazines. 
It might be conjectured that, ideally, the 
reading ease level for which to strive would 
be Standard if this could be done without 


1 Thanks are due Mr. Arthur Windett of Newsweek 
and Mr. Thomas E. Ryan of Time. 


seeming to write down to the readers with 
more education. 

Certainly, there appears to be nothing 
desirable in the trend toward more difficult- 
to-read advertising copy. On the other hand, 
from the standpoint of the advertiser, there 
would seem to be an evident advantage in 
the fact that advertising copy is easier to
understand than the editorial copy with which 
it competes. 

Human Interest. Table 2 contains the data 
on Human Interest. In the case of Time, 
none of the differences is statistically signif- 
icant. On the other hand, in Newsweek, before 
the war, advertising copy was significantly 
more interesting than the editorial copy with 
which it had to compete. This trend is not
evident in the post-war period. In general,
Time would appear to edge Newsweek with
regard to Human Interest in all of its copy.
The implication of this finding for both the


Table 2


Pre-war and Post-war Differences in Human Interest Value of Advertising and Editorial Copy in Time and Newsweek


                          Time                          Newsweek
                 Average   Descriptive         Average   Descriptive
                 Score     Style               Score     Style
Ad. Copy
    Pre-war      22.3      Interesting         24.6      Interesting
    Post-war     24.4      Interesting         18.3      Mildly interesting
Ed. Copy
    Pre-war      24.4      Interesting         18.1      Mildly interesting
    Post-war     28.8      Interesting         19.1      Mildly interesting


Readability of Copy in Time and Newsweek


editors of Newsweek and advertisers therein
is obvious.


Summary 


An investigation was made of the readability
of pre-war and post-war advertising and
editorial copy in the two national newsweeklies,
Time and Newsweek. The Flesch formulae
for Reading Ease and Human Interest were
used. A trend was found toward more
difficult advertising copy, though only in Time
was this significant. A trend was found
toward easier-to-read editorial copy, but only
in Newsweek was this significant. In both
magazines, advertising copy was found to be
easier to read than editorial copy. With
regard to human interest, Time magazine was
found to be significantly more interesting
than Newsweek both with regard to advertising
copy and editorial copy.


Received June 27, 1951. 


References 


1. Flesch, R. The art of readable writing. New York:
Harper and Brothers, 1949.

2. Hotchkiss, S. N., and Paterson, D. G. Flesch readability
reading list. Personnel Psychol., 1950, 3,
327-344.


Readability of Instructional Film Commentary 


William Allen 
San Diego State College, California 


Most experimental research with instruc- 
tional motion pictures has been purely com- 
parative in nature. The effectiveness of the 
film is compared with more conventional 
types of instruction or it is compared with 
other kinds of audio-visual materials. This 
type of research adds little to an understanding 
of which elements within the films are respons- 
ible for gains or losses in specific learnings. 

Only in recent years have investigators 
begun to study the specific elements within 
films which produce different responses in the 
learners. These recent studies deal principally 
with comparisons of different ways of produc- 
ing, organizing, and presenting the visual and 
auditory contents of instructional films. 

Also, in recent years, impetus has been given 
to the measurement of the readability of a great 
variety of written materials, with emphasis 
being placed on the application of the Flesch 
Readability Formulas (5). Although read- 
ability formulas were derived from printed 
verbal materials, Chall and Dial (2) give some 
experimental evidence that the formulas can 
be used to predict the readability of oral
verbal material. The present study contrib- 
utes further evidence that such is the case. 

This study investigates several pertinent
problems relative to the measurement of the
readability of film commentary: (a) Can
readability formulas be used to measure the
difficulty of film commentaries, or sound
tracks? (b) If so, what effect will the grade
level of the commentary have on the learning
of factual material? (c) How do the Flesch (4),
Dale-Chall (3), and Lorge (6) readability
formulas differ in their prediction of amount
of learning? (d) What effect will “human
interest” factors in the commentary have upon
the learning of factual material? (e) What
specific factors, measurable by readability
formulas, contribute to factual learning?
(f) What other factors, not measured by
readability formulas, appear to contribute to
factual learning?


The Experimental Plan


The experiment was designed to study the
effect of four arbitrarily selected film commentary
variations upon the learning of
factual information by sixth grade students.

Four commentaries were written for each of
the two Encyclopaedia Britannica Films, The
Mosquito and The Water Cycle. The commentaries
were written to specific word counts,
as measured by the Flesch “Reading Ease”
and “Human Interest” Formulas, at the
following two levels of difficulty and two levels
of human interest: fifth grade-very interesting
(5-VI), fifth grade-dull (5-D), seventh grade-very
interesting (7-VI), and seventh grade-dull
(7-D). They contained the same essential
information for each film and were recorded by
the same voice on a tape recording. The films
were shown to the experimental groups by
running the motion picture projector silently
and synchronizing the tape recording with it.

The experimental groups included 668 sixth
grade students (all of the 22 sixth grade
classes) in four elementary schools in San
Bernardino County, California. The children
were in the eighth month of the sixth grade.

The test for The Mosquito consisted of
eighteen multiple-choice and three completion
questions, totaling twenty-eight points; that for
The Water Cycle had sixteen multiple-choice
and two completion questions, totaling twenty-one
points. Test items were eliminated where
they, by similar wording or other characteristics,
favored one or several of the commentaries at
the expense of one or more of the other
commentaries. As measured by the test-retest
technique with the control groups, The
Mosquito test had a coefficient of reliability
of .68, and The Water Cycle test, .64.

The experimental design used was one which


analyzed the results from a series of duplicated 
experiments, each of which was conducted in 
an individual school with relatively homoge- 
neous classes. In order to satisfy the require- 
ments of the design, the pupils were given a 
pre-test on their knowledge of the subject
matter being tested. On the basis of the
pre-test score, level of intelligence, and reading
ability, the classes were equalized within each
school and randomly assigned to different
methods. Two weeks following the pre-test,
the classes were shown the experimental
versions of films (with the exception of the
control groups, which did not see the films).
One day later the test was re-administered to
the experimental and control groups, and the
gains over the pre-test scores were computed.
This was followed by the second film and test
on the following days, using the same procedure.
Three statistical measures were used: (1)
The results were analyzed by means of an
analysis of variance, which provided an
estimate of errors due to randomized variables
other than the experimental variable. (2) The
criterion scores, used as measures of learning
resulting from the film versions, were
expressed by the “per cent of gain of the
room for improvement.” (3) In order to
determine the influence upon learning of
specific factors in the film commentaries, a
technique of correlation analysis was used:
(a) the commentaries were analyzed to find the
excerpts in each that contained the information
necessary to answer each of the questions on
the tests; (b) separate readability word counts
were made of each of the excerpted passages;
(c) standard scores of the word counts were
used as tallies in computing the correlations
with the gains on the tests.
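The criterion score and the standard scores named above can be illustrated with a short sketch. The article does not print its formulas, so the definitions here are assumptions: per cent of gain is taken as the gain divided by the points still available after the pre-test, and standard scores are taken as ordinary z-scores. The function names and the sample numbers are hypothetical.

```python
from statistics import mean, pstdev


def percent_gain(pre, post, max_score):
    """Gain expressed as a per cent of the room for improvement.

    Assumed definition: 100 * (post - pre) / (max_score - pre).
    """
    room = max_score - pre
    if room <= 0:
        raise ValueError("no room for improvement on the pre-test")
    return 100.0 * (post - pre) / room


def standard_scores(values):
    """Convert raw word counts to z-scores (mean 0, S.D. 1)."""
    centre = mean(values)
    spread = pstdev(values)
    return [(v - centre) / spread for v in values]


# Hypothetical pupil: 10 of 28 points before the film, 19 after it.
example_gain = percent_gain(10, 19, 28)

# Hypothetical word counts for three excerpted passages.
example_z = standard_scores([12, 18, 24])
```

Expressing gains relative to the room for improvement keeps pupils with high pre-test scores from appearing to learn less simply because few points remained to be gained.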


Results 


The results of the experiment applicable to
both films are:

1. It appears that readability formulas can
be used to measure the difficulty of instructional
film commentaries. Under the conditions
of this experiment, the level of readability
at which the oral commentary was written had
a measurable effect upon the learning of the
factual content of the film. Table 1 shows that


the fifth grade commentaries resulted in 
significantly greater learning of the factual 
material than did the seventh grade commen- 
taries. This finding was consistent for both 
films, although the differences were greater for
The Mosquito (10%) than for The Water Cycle
(6%). The fifth grade commentaries showed
a mean gain of 8% for both films. All the
film versions resulted in significant learning 
gains over the control groups, the control 
groups gaining only 3% on The Mosquito and 
5% on The Water Cycle in comparison with 
gains ranging from 25% to 47% on The 
Mosquito and 17% to 43% on The Water Cycle 
in the case of the experimental groups. 

2. All three readability formulas correlated 
significantly and to about the same degree 
with the amount of improvement on the tests. 
Table 2 shows these correlations. However, 
when the reading grade placements of the 
complete commentaries were measured by the 
formulas, it will be noted from Table 3 that the 
Lorge formula underestimated the grade-level 
placement of the commentaries by about two 
grades when compared with predictions of the 
Flesch (R.E.) formula, and that the Dale-Chall 
formula overestimated the grade in one case 
and underestimated in the other. 

3. The one factor common to all three
readability formulas, average sentence length 


Table 1


Differences in Mean Scores Among Per Cent of
Improvement for All Film Versions


Film Versions       5-VI    5-D       7-VI           7-D
The Mosquito:
    5-VI                    7.31§     10.39[†/‡]     17.01*
    5-D                                3.08           9.70[†/‡]
    7-VI                                              6.62§
    7-D

The Water Cycle:
    5-VI                   −6.64       1.73           4.34
    5-D                                8.37§         10.98[†/‡]
    7-VI                                              2.61
    7-D


* Significant at the 0.1% level.
† Significant at the 1% level.
‡ Significant at the 2% level.
§ Significant at the 5% level.


Table 2


Coefficients of Correlation Between Per Cent of Gain and Specific Factors


                                          The Mosquito         The Water Cycle
Specific Factor                           Number   r           Number   r

Readability Formulas:
    Flesch (R.E.)                         84       .439**      72       .502**
    Dale-Chall                            84       .553**      72       .421**
    Lorge                                 84       .456**      72       [illegible]
Average Sentence Length                   84       .614**      72       .432**
Length of Passage                         84       .072        72       .076
Human Interest Factors:
    Flesch (H.I.) Formula                 84       .306**      72       −.094
    Combined Flesch (R.E.) and (H.I.)†    84       [illegible] [illegible]
    Per Cent of Imperative Sentences      52       .266*       60       −.127
    Per Cent of Questions                 76       .257*       64       −.174
    Per Cent of Personal Words            84       .292**      [illegible]
    Per Cent of Personal Sentences        80       .247*       [illegible]


* Significant at the 5% level.
** Significant at the 1% level.
† Coefficient of multiple correlation among per cent of gain, Flesch (R.E.) formula and Flesch (H.I.) formula.


of the material, had the highest correlation 
(.61) of any elemental factor. 

4. The length of the passage in words had 
no measurable effect upon learning from the 
films. 
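The coefficients discussed in the results above are correlations between a commentary factor and the per cent of gain on the matching test question. The article does not print the computing formula; the sketch below assumes the ordinary product-moment coefficient. The function name and the sample data are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squares about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: a readability score for four excerpts and the
# per cent of gain on the test questions they answer.
readability = [55.0, 62.0, 70.0, 81.0]
gain = [20.0, 26.0, 31.0, 44.0]
r = pearson_r(readability, gain)
```

With 84 excerpt-by-version pairs for The Mosquito and 72 for The Water Cycle, one such coefficient would be computed per factor and per film.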

The results of the experiment applicable to 
only one of the two films are: 


1. The incorporation of such human interest
or personal reference factors as “personal words”
and “personal sentences” in a film that lends
itself to such humanizing resulted in greater
informational learning. Only in The Mosquito
film did the human interest factors exert an
influence, contributing about 7% to the learning.
In The Water Cycle, their influence was
insignificant. This factor did not contribute
as much to factual learning as did the reading
ease factor.


Table 3

Readability Characteristics of the Complete Commentaries

Film and            Flesch (R.E.)   Dale-Chall   Lorge      Flesch (H.I.)
Version             (grade)         (grade)      (grade)    (score)
The Mosquito:
    5-VI            5.5             5.5          3.5        70.9*
    5-D             5.5             7.8          4.0        3.2‡
    7-VI            7.3             8.6          5.1        [illegible]
    7-D             7.5             10.0         5.5        [illegible]
The Water Cycle:
    5-VI            5.4             4.1          3.6        [illegible]
    5-D             5.5             4.6          4.0        [illegible]
    7-VI            7.1             6.9          4.9        [illegible]
    7-D             7.6             7.2          5.1        6.0‡


* Flesch (Human Interest) score rated as “dramatic.”
† Flesch (Human Interest) score rated as “very interesting.”
‡ Flesch (Human Interest) score rated as “dull.”


2. The creation of a “pattern” or “outline” 
which enumerated facts or concepts to be 
learned resulted in significantly greater learning 
than when such devices were not employed. 
Three test questions were not used in the
principal experimental analysis because the
fifth grade commentaries, in each case,
enumerated a series of facts in a logical order,
whereas the seventh grade commentaries did
not. It was noted that the gains for these
patterned commentaries were extremely high,
in one case 76% of the room for improvement.
They were, therefore, submitted to further
statistical analysis, which revealed that the
correlation between the patterned elements
and learning, with the influence of the
repetition and Flesch (R.E.) factors ruled out,
was extremely high (.81). And a most
interesting finding is that the factor of
readability, as measured by the Flesch (R.E.)
formula, apparently had no influence. The
mean gain for the patterned over the
unpatterned commentaries was 28%, an
appreciable amount when compared with a
gain of about 6% for the same commentary
versions in the remaining test questions of
The Water Cycle film. This finding cannot be
considered conclusive because only three test
questions were involved; but this problem
should be submitted to further experimentation.


Discussion 


Several of the results deserve further
discussion:

First, in connection with the differences in
readability prediction among the three
formulas, it should be remembered that the
comparisons were made with the Flesch (R.E.)
formula and his human interest factors were
not considered. It is interesting to note that
both [text illegible] (Table 3). These facts
raise an important question: What is the
grade level of the material if the formulas
differ, in some cases, as much as two and
one-half grades? This is a problem that needs
further study, as does the influence of
visuals in the film itself upon the readability
level. Is the readability score a measure of
the grade level at which the film may be
viewed with understanding, or are there


factors within the film’s visual elements that 
affect this grade level placement? In compar- 
ing the three formulas, it should be noted that 
the Flesch formulas are much easier to apply, 
taking only half as long as the other two. 
They require only simple word counts and a 
few mathematical computations in comparison 
with the more time-consuming and tedious 
use of word lists necessary with the Dale- 
Chall and Lorge formulas. 

Second, particular attention was paid to the 
influence of the human interest factor, percent- 
age of imperative sentences, because Zucker- 
man (7) found that imperative statements 
were more effective in promoting learning 
than passive types of statements. It was 
found, however, that imperative sentences were 
no more effective than the other human interest 
factors. This problem deserves further study. 

Third, human interest factors contributed 
to learning when the film lent itself to humaniz- 
ing, as in the case of The Mosquito. In this 
film, the mosquito was given a name, “Skeeter,” 
was referred to as “her” and shown in relation- 
ship to “her mother.” The life cycle concept 
appeared to be more susceptible to this type of 
humanizing than did the water cycle concept. 
It is possible, however, that The Water Cycle 
commentary could have been written differ- 
ently in order to humanize and personalize 
the water cycle concept, but such was not done.
Where The Mosquito had frequent references 
to “Skeeter,” to “her,” and to “her mother,” 
The Water Cycle had practically no third 
person pronouns of natural gender. Almost 
all “personal words” making up the Human 
Interest word count were first and second 
person pronouns. This fact may have tended 
to call attention of the “you” and the “we” to 
the content of the film, but may not have 
personally involved the viewer in the content of
the film. This is an important problem that 
deserves further study. The questions can be 
asked: What might have been the effect upon 
learning had the water cycle concept been 
dramatized and given some human character- 
istics? Or is the nature of the subject matter 
in The Water Cycle such that it cannot be
dramatized and made more interesting? The
findings indicate that the factor of human
interest is not clearly defined and that possibly
the Flesch (H.I.) formula over-simplifies it.


Fourth, a number of other questions are 
raised by this study: What effect upon learning 
has the enumerating of a series of facts in 
outline form? What kind of material lends 
itself to such listing? Is the effect of the 
outlining so great that the readability level of 
the material is not important? What is the 
relationship between the enumerating of the 
facts and the repetition of the facts? Do 
questions in the commentary lead to increased 
learning? If so, what kinds of questions, and 
where should they be placed in relation to the 
film content? Are there some kinds of ques- 
tions that will decrease learning from the film? 
Should the questions asked be answered in 
the film, or should they remain unanswered 
in order to elicit further study by the student? 
It was noted that both the Flesch (R.E.) and 
Dale-Chall formulas count every occurrence of 
difficult words; whereas the Lorge formula 
counts the difficult word only the first time it 
appears. What effect do these practices 
have upon the “true” readability level of the 
commentary, if it can be determined? If a 
vocabulary review can be held before the 
film showing and the difficult words learned, 
to what extent is the readability level of the 
film lowered? 


Summary 


1. The level of readability at which the oral 
commentaries of factual instructional films 
were written had a measurable effect upon the 
learning of the factual content of the film. 
Commentary written one grade level below 
the present grade level of the pupils resulted in 
significantly greater learning than did com- 
mentary written one grade level above. 

2. The Flesch, Dale-Chall, and Lorge Read- 
ability Formulas were about equal in predicting 
the comparative readability of film commen- 
taries when several were measured. However, 




the Lorge Formula consistently predicted a 
reading level approximately two years lower 
than the Flesch Reading Ease Formula, and 
the Dale-Chall Formula varied in its prediction.

3. The incorporation of such human interest 
factors as questions, imperatives, and personal 
pronouns in a film that lends itself to such 
humanizing resulted in greater learning. 

4. It can be tentatively concluded that the 
creation of an “outline” or the enumeration of 
facts to be learned resulted in much greater 
learning than when such a procedure was not 
used.

5. The length of the passage had no measurable
effect upon learning from the films.

6. The findings from the study have applica- 
tion by both film producers who are creating 
films and audio-visual educators who are 
selecting films for use at the various grade 
levels (1). 


Received June 27, 1951. 


References 


1. Allen, W. H. A scientific method for evaluating
films. Nation's Schs., 1951, 47, 76-78.

2. Chall, Jeanne S., and Dial, H. E. Predicting listener
understanding and interest in newscasts. Educ.
Res. Bull., Ohio St. Univ., 1948, 27, 141-153.

3. Dale, E., and Chall, Jeanne S. A formula for predicting
readability. Educ. Res. Bull., Ohio St.
Univ., 1948, 27, 11-20 and 37-54.

4. Flesch, R. A new readability yardstick. J. appl.
Psychol., 1948, 32, 221-233.

5. Hotchkiss, S. N., and Paterson, D. G. Flesch readability
reading list. Personnel Psychol., 1950, 3,
327-344.

6. Lorge, I. Predicting readability. Teach. Coll. Rec.,
1944, 45, 404-419.

7. Zuckerman, J. V. Commentary variations: level of
verbalization, personal reference and phase relations
in instructional films on perceptual-motor
tasks. Technical Report SDC 269-74, Instructional
Film Research Program, The Pennsylvania
State College, December 15, 1949.


Achievement of Freshman Engineering Students and the 
Strong Vocational Interest Blank 


S. D. Melville
Princeton University

and

Norman Frederiksen
Princeton University and Educational Testing Service


The purpose of this study was to determine
the relationship between measures of first-year
academic achievement and scores on the
Strong Vocational Interest Blank for Men for
a group of freshman engineering students at
Princeton University. Since the Strong test
is widely used in educational counseling, it
seemed desirable to investigate its relationship
to academic achievement in engineering,
although the test was not originally intended
to be a predictor of academic success. It
seemed particularly important to determine to
what degree scores made on the Strong test
were related to an adjusted achievement
measure, i.e., a measure of the amount by
which the obtained average grade exceeded
(or fell below) the average grade predicted for
a student on the basis of measures of ability.
Such a measure is often referred to as a measure
of “overachievement” or “underachievement.”


Procedure


An attempt was made to obtain interest test
scores for students of the class entering
Princeton in September of 1947 who completed
the first year of their work in the School of
Engineering, and for whom certain measures
were available. Foreign students were
excluded. Of the 104 who met all of these
criteria, Strong Vocational Interest Blank
scores were obtained for 93. The curriculum is
essentially the same for all first-year students
in the School of Engineering at Princeton.

Administration of the Strong blank was
spread over a two-year period. Most of the
students were tested as part of this study;
some were tested by Mr. Fagin for a different
study which he was conducting, and he kindly
consented to make the scores available; and
7 students took the test at various
other times between the fall of 1947 and the
fall of 1949. Other investigators, including
Burnham (3), Glass (5), and Strong (7, p. 277
and Ch. 15; 8), have pointed out that changes
in Strong scores occur over a period of time.
Evidence concerning the amount and direction
of change appears to be inconclusive. Therefore,
it is difficult to estimate what effect
administering the blank at various times over
a two-year period may have had on the
results obtained.

Three measures were used in this study:
(1) standard scores on the Strong test, (2)
freshman average grade, and (3) adjusted
average grade. Scores on all of the occupational
scales plus the interest maturity,
occupational level, and masculinity-femininity
scales were obtained. The freshman average
grades were obtained by averaging the grades
in all courses taken during the freshman year
for each student.

Adjusted average grade is the adjusted
achievement measure employed in the study;
it is the difference between each student's
predicted freshman average grade and the
average grade which he actually obtained for
his freshman year. The predicted freshman
average grade was based upon a multiple
regression equation involving the College
Entrance Examination Board test in science
(chemistry, physics, or the average of both)
and the Converted School Grade. The Converted
School Grade is a prediction of freshman
average grade determined for each student by
the admissions office of the University. It is
based upon the student's rank in his secondary
school graduating class; a correction is
introduced which takes into consideration the
scholastic records attained at Princeton by


previous students from his particular secondary 
school. In an earlier study it was found 
that these two measures (College Entrance 
Examination Board science test score and 
Converted School Grade) provided the most
efficient combination for the prediction of 
freshman average grade in the engineering 
curriculum at Princeton. 


Results 


In Table 1 are presented the intercorrelations 
of the measures used in computing the pre- 
dicted grades. Combining the Converted 
School Grade and the College Board science 
scores gave a multiple correlation coefficient of 
.62 with freshman average grade. The beta 
weights were .3787 for Converted School 
Grade and .4053 for College Board science score. 
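The beta weights and the multiple correlation quoted above follow from the three intercorrelations in Table 1 (.25 between the two predictors, .48 and .50 with freshman average grade). The sketch below uses the standard two-predictor formulas in standard-score form; it is offered as a check on the arithmetic, not as the authors' own computing procedure. The function names are hypothetical, and the sign convention for the adjusted grade (obtained minus predicted, so that overachievement is positive) is an assumption.

```python
from math import sqrt


def two_predictor_regression(r1y, r2y, r12):
    """Beta weights and multiple R for two standardized predictors."""
    denom = 1.0 - r12 ** 2
    beta1 = (r1y - r2y * r12) / denom
    beta2 = (r2y - r1y * r12) / denom
    multiple_r = sqrt(beta1 * r1y + beta2 * r2y)
    return beta1, beta2, multiple_r


# Intercorrelations from Table 1: Converted School Grade (1),
# College Board science (2), freshman average grade (y).
beta_csg, beta_ceeb, multiple_r = two_predictor_regression(0.48, 0.50, 0.25)


def adjusted_grade_z(z_actual, z_csg, z_ceeb):
    """Obtained minus predicted grade, all in standard-score units.

    Assumed sign convention: positive values mean overachievement.
    """
    predicted = beta_csg * z_csg + beta_ceeb * z_ceeb
    return z_actual - predicted


print(round(beta_csg, 4), round(beta_ceeb, 4), round(multiple_r, 2))
# prints 0.3787 0.4053 0.62
```

The computed values agree with the reported weights of .3787 and .4053 and the multiple correlation of .62.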

The correlations of each scale on the Strong 
test with (1) freshman average grade and (2) 
adjusted average grade were computed. The 
results obtained are presented in Table 2. 

Eight of the means on the occupational 
scales were above 35. These eight means were 
for aviator (41.6), production manager (41.2), 
scientific farmer (38.3), engineer (37.3), printer 
(36.8), real estate salesman (36.3), personnel 
director (35.6), and mathematics-physical 
science teacher (35.4), While these are mainly 
occupations stressing technical activities, sales 
and administration are also represented. 

Mean scores of less than 25 were found for 
nine of the occupational scales. These means 
were for banker (24.9), C.P.A. (23.7), dentist 


Table 1


Intercorrelations of Ability and Achievement
Measures (N = 93)


                            Converted      College     Freshman
                            School         Board       Average
                            Grade          Science     Grade
Converted School Grade                     .25         .48
College Board Science       .25                        .50
Freshman Average Grade      .48            .50
Mean                        3.2            589         2.9
S.D.                        [illegible]    97          .8




(23.6), artist (21.2), Y.M.C.A. secretary (20.8), 
mathematician (20.7), psychologist (19.8), city 
school superintendent (17.0), and minister 
(14.8). As might be expected, the types of 
occupation included in these scales are more 
varied than in the group of scales having 
relatively high means. Occupations in
welfare, biological science, and business detail
are represented, as well as artist and
mathematician. The mean scores on the three
additional scales (interest maturity,
occupational level, masculinity-femininity) were
all somewhat above 50.
The standard deviations for the
occupational scales ranged from 6.9 to 15.9.
There was little, if any, relationship between
the standard deviations and correlations with
freshman average grade. There was some
tendency, however, for high correlations with
adjusted average grade to be associated with
high standard deviations of the occupational
scales. The rank order correlation between
these two variables (ignoring the signs of the
correlations) was .33. This correlation,
however, is not significantly different from zero.

The correlations between interest scales and
measures of achievement are shown in the
last two columns of Table 2. With an N of
93, the standard error of a product moment
correlation would be 0.104 if the true
correlation were zero. Thus a correlation would
have to be as high as .20 to be significant at the
five per cent level, or .27 to be significant at the
one per cent level. Considering that there
are 42 Strong scales involved, on the basis of
chance alone we would expect about two of
the correlations to be significant at the five per
cent level and none at the one per cent level.
Actually seven of the correlations between
Strong scales and freshman average grade
were significant at the five per cent level and
one was significant at the one per cent level.
The significance of the correlations between
Strong scales and adjusted average grade was
not computed, since the basic assumptions of
the conventional statistic for the standard
error of a correlation were not fulfilled by the
adjusted average grade, which involves the
difference between two measures.
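The thresholds quoted above can be checked with a few lines of arithmetic. The sketch assumes the conventional large-sample standard error of a correlation under a true value of zero, 1/sqrt(N − 1), and the normal deviates 1.96 and 2.576 for the two-tailed five and one per cent levels; the article gives the results but not these formulas. All names are hypothetical.

```python
from math import sqrt


def null_standard_error(n):
    """Standard error of r when the true correlation is zero (assumed form)."""
    return 1.0 / sqrt(n - 1)


N_STUDENTS = 93
N_SCALES = 42

se = null_standard_error(N_STUDENTS)

# Smallest |r| reaching each two-tailed significance level.
critical_5 = 1.96 * se
critical_1 = 2.576 * se

# Number of "significant" correlations expected by chance alone.
expected_5 = N_SCALES * 0.05
expected_1 = N_SCALES * 0.01

print(round(se, 3), round(critical_5, 2), round(critical_1, 2))
# prints 0.104 0.2 0.27
```

The expected counts come to 2.1 and 0.42, which matches the statement that about two correlations would reach the five per cent level by chance and none the one per cent level.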
The correlations between freshman average
grade and the Strong scales which are significant
at least at the five per cent level


Table 2

Means and Standard Deviations of Strong Scales and Correlations with Achievement Measures (N = 93)

Group   Scale   Mean   S.D.   Correlation with Freshman Average Grade   Correlation with Adjusted Average Grade†

. 0.6 .06 14 

Artist 21.2 1 fe 
Psychologist 19.8 123 32 os 
Architect 25.7 11.8 07 12 
I ae 27.0 10.6 15 22 
ate aa 28.2 9.9 04 17 
copa 10.6 .08 -16 

Dentist 23.6 

= 20.7 10.3 .22* 124 
Mathematician 29 3 15.5 .20* -24 
H Physicist 373 12.1 abd Ad 
Engineer 33.6 141 .26* .29 
Chemist .00 —.04 


ur 2 $ 
/ Production Manager al. 96 10 —.04 
38.3 8 
06 04 


Iv 


Y.M.C.A, Phys. Director 35.6 12.4 i i 
Personnel Director 34.8 10.0 ‘06 i 
y Public Administrator 20.8 11.6 ‘05 ‘05 
Y.M.C.A. Secretary 258 12.4 ‘17 16 
Soc. Sci. H. S. Teacher 17.0 11.3 16 6 
City School Supt. 148 12.9 
y Minister 11.1 1 18 
x 26.9 18 09 
VI Musician Bi 8.3 a = 
| C.P.A. 704 9.7 On at 
y Accountant 33.0 bs —.03 —.23 
Ty Office Man 33.4 34 —.22* —.28 
Purchasing Agent 24.9 93 —.24* —.25 
Banker 27.6 Zos Zig 
Mortician 32.5 m —.25* —.31 
x Sales Manager 36.3 12.6 —.16 — 28 
Real Estate Salesman 27.4 77 —.02 —.01 
Life Insurance Salesman 31.2 3.0 .09 -00 
X Advertising Man 27.3 69 -01 08 
Lawyer 27.6 z =i —.12 
XT Author-Journalist 33.5 is 12 15 
; 58 is 4 
President Mfg. Concern 319 52 i —.05 
Tnterest-Maturity 52.9 7.7 q ~m 
Occupat. Level 54.3 
Š Masculinity-Femin. oe 
* Significant at the five per cent level of confidence.
** Significant at the one per cent level of confidence.
† The significance of the correlations between Strong scales and adjusted average grade was not computed,
since the basic assumptions of the conventional statistic for the standard error of a correlation were not
fulfilled by the adjusted average grade, which involves the difference between two measures.


Farmer 

Aviator 

Carpenter 

Printer 

Math. Phys. Sci. Teacher 


O01 
14 
25 
— .04 
07 


indicate that academic achievement in the
engineering curriculum for the first year at 
Princeton is most closely related in a positive 
manner with scores on the Strong scales for 
psychologist (.32), chemist (.26), mathematics- 
physical science teacher (.24), mathematician 
(.22), and physicist (.20). Apparently, then, 
grades tend to vary directly with interests in 
activities associated with men in scientific 
occupations. 

Significant negative relationships exist be- 
tween academic achievement and Strong scores 
for real estate salesman (—.25), mortician 
(—.24), and banker (—.22). Evidently the 
more successful engineering students tended 
to look with disfavor on activities commonly 
preferred by men in business occupations such 
as these. 

Correlations tend to be somewhat higher with 
adjusted average grade than with the unad- 
justed grades, as might be expected when 
ability is held constant. The positive correla- 
tions which are .20 or higher are between 
adjusted average grade and scores on the 
Strong scales for psychologist (.37), chemist 
(.29), minister (.26), mathematics-physical 
science teacher (.25), mathematician (.24), 
physicist (.24), and physician (.22). Thus 
those whose achievement exceeds their predic- 
tion tend to have interests which are character- 
istic of men in scientific occupations. Surpris- 
ingly enough, the scale for minister appears 
in the list; perhaps some characteristic such as 
conscientiousness is common to ministers and 
“overachievers” in engineering. 

A negative relationship exists between 
adjusted average grade and Strong scores for 
real estate salesman (−.31), banker (−.28), 
mortician (−.25), purchasing agent (−.23), 
and life insurance agent (−.20). There 
appears to be a tendency for those whose 
achievement falls short of their prediction to 
prefer activities associated with occupations 
stressing business detail and sales. 

None of the correlations between the three 
additional Strong scales (interest maturity, 
occupational level, masculinity-femininity) and 
freshman average grade or adjusted average 
grade was as high as .20. 

The correlations obtained must be inter- 
preted with caution. They were obtained 
from a very restricted group: students who 


S. D. Melville and Norman Frederiksen . 


entered Princeton in September of 1947, who 
completed the first year of their work in the 
School of Engineering, and for whom all data 
were available. Since these students were 
admitted on the basis of a composite criterion 
involving high school record, interviews, test 
scores, self-selection, and other unidentifiable 
factors, many of which were highly subjective, 
and since these factors were never dealt with 
in such a manner as to yield a score, it is 
impossible to estimate either the direction or 
the amount of distortion produced by basing 
our statistics on this curtailed group. 

Examination of comparable data (means, 
standard deviations and correlations with 
freshman average grades) obtained by other 
investigators, including Berdie (1, 2), Burnham 
(3), Coblentz (4), Glass (5), and Johnson (6), 
revealed findings similar to those obtained in 
this study. None of these investigators 
attempted to hold ability constant, however. 

What are the implications of these findings 
for educational counseling? It appears that 
the use of the Strong Vocational Interest Blank 
in advising students regarding probable 
academic success in the engineering curriculum is 
justified to a limited extent. The magnitude 
of the correlations with freshman average grade 
and adjusted average grade suggests that care 
should be taken not to give too much weight 
to the Strong scores. The Strong scores should 
of course be evaluated in relation to other 
pertinent vocational guidance data. No 
evidence is provided by this study concerning the 
value of the test for predicting future job 
satisfaction, the purpose for which the test was 
designed. 

No attempt was made to include any of the 
Strong scales in a multiple regression equation 
for prediction purposes. Considering the 
number of scales included in the blank and 
the size of the group being tested, it was felt 
that such a procedure would result in undue 
capitalization on chance. Furthermore, 
comparable data were not available on a similar 
group of students as a means of cross-validation. 


Summary 


The purpose of the study was to investigate 
relationships existing between measures of 
first-year academic achievement and scores on 

Achievement of Students and Vocational Interest Blank 


the Strong Vocational Interest Blank for Men 
for a group of Princeton engineering students. 
Complete data were obtained on 93 out of the 
104 eligible students in the class entering 
Princeton in September of 1947. 

The measures used included: (1) scores on 
all of the scales of the Strong Vocational 
Interest Blank for Men; (2) the average of 
grades obtained in all courses taken during the 
freshman year (freshman average grade); and 
(3) an adjusted achievement measure, called 
adjusted average grade. The last measure is 
the difference between the predicted average 
grade and the obtained average grade. The 
predicted average grade was based on a multiple 
regression equation involving the scores on 
College Entrance Examination Board tests 
and a measure based on secondary school 
achievement. The procedure involved obtaining 
the correlation of each scale on the Strong 
Blank with: (1) freshman average grade, and (2) 
adjusted average grade. 
The engineering student had relatively 
high interest in activities associated 
with occupations stressing scientific work and 
business sales and administration. He was 
relatively low in interest for activities associated 
with men in certain welfare, biological 
science, and business detail occupations. 

There was a wide range in variability of 
student scores on the various scales. No 
marked relationship was evident between the 
size of the variability and any of the other 
statistics obtained. 

Several of the correlations between freshman 
average grade and the Strong scales were 
significant at the five per cent level. The 
results suggest that academic success 
was directly related to interest in 
activities associated with men in scientific 
occupations, and inversely related to interest 
in activities associated with men in occupations 
stressing business detail and sales. 

Twelve of the correlations between adjusted 
average grade and the Strong scales were 
.20 or higher. Holding ability constant 
tended to emphasize the relationships noted 
between freshman average grade and the 
Strong scores. 

None of the correlations between the three 
additional Strong scales (interest maturity, 
occupational level, masculinity-femininity) and 
freshman average grade was significant at the 
five per cent level. 


Received June 9, 1951. 


References 


1. Berdie, R. F. Factors associated with vocational 
interests. J. educ. Psychol., 1943, 34, 257-277. 

2. Berdie, R. F. The prediction of college achievement 
and satisfactions. J. appl. Psychol., 1944, 28, 
239-245. 

3. Burnham, P. Stability of interests. Sch. and Soc., 
1942, 55, 332-335. 

4. Coblentz, I. Prognosis of freshman academic achieve- 
ment at the Pennsylvania State College. Unpub- 
lished Doctor’s thesis, Penn. State College, 1942. 

5. Glass, C. F. An investigational analysis of certain 
general and specific interests of engineering stu- 
dents. Unpublished Master’s thesis, Purdue 
Univ., 1934. 

6. Johnson, A. P. The prediction of scholastic achieve- 
ment for freshman engineering students at Purdue 
University: studies in engineering education, II. 
Lafayette, Ind.: Purdue Univ., Div. Educ. Re- 
search, 1942. 

7. Strong, E. K., Jr. Vocational interests of men and 
women. Stanford Univ., Calif.: Stanford Univ. 
Press, 1943. 

8. Strong, E. K., Jr. Permanence of interest scores 
over 22 years. J. appl. Psychol., 1951, 35, 89-91. 


Vocational Interests of Industrial Relations Personnel * 


Philip H. Kriedt, C. Harold Stone, and Donald G. Paterson 


Industrial Relations Center, 


Since the period in which the present 
Personnel Director key of the Strong Voca- 
tional Interest Blank was developed (1928-38), 
the field of industrial relations and personnel 
management in business and industry has seen 
many changes. Among the most important of 
these has been the increasing levels of training 
and skill demanded of manpower managers 
as the field has begun to achieve a more 
professional status (1, 2, 3, 4, 9, 10, 14). 
There has been a trend toward greater special- 
ization in job duties, the professional literature 
and professional organizations have increased 
in number, and many universities and colleges 
have established special curricula for the 
training of professional industrial relations 
personnel. In view of these developments, the 
Minnesota Industrial Relations Center has 
undertaken a series of studies to determine the 
characteristics of personnel in this field, the 
nature of their duties, and the growth of a 
specialized terminology (5, 6, 7, 8, 15, 16, 17). 

The present study is a part of this series and 
was aimed at ascertaining if any changes in 
the measured vocational interests of industrial 
relations personnel have occurred since stand- 
ardization of the Personnel Director Key in 
Strong’s Vocational Interest Blank. The 
specific purposes of this investigation were: 
(1) to determine which industrial relations 
positions, if any, are adequately measured by 
the present Personnel Director Key of the 
Strong Vocational Interest Blank; (2) to 
develop a new key or keys for such positions if 
needed; and (3) to provide additional data 
for these positions based on the 39 occupational 
keys of the Strong Vocational Interest Blank. 


Significance of Measured Vocational Interests 


Numerous studies have shown that successful 
workers in an occupation have certain similar- 


* Grateful acknowledgment is made to the University 
of Minnesota Graduate School for research grant in 
support of this study. The authors are also indebted 
to Dr. D. Yoder, Director, and Dr. H. G. Heneman, 
Jr., Assistant Director, Industrial Relations Center for 
aid in preparation of the manuscript. 


University of Minnesota 


ities of interests (a characteristic set of likes 
and dislikes) which differentiate them from 
other occupational groups (12, 13). The 
Strong Vocational Interest Blank permits 
measurement of the occupational pattern with 
which a given individual’s interests most 
nearly coincide. Strong interprets the results 
as follows: “It is assumed that if a man likes 
to do the things which men like who are 
successful in a given occupation and dislikes 
to do the things which these same men dislike 
to do, he will feel at home in that occupational 
environment. Seemingly, also, he should be 
more effective there than somewhere else 
because he would be engaged, in the main, in 
the work he liked” (11). 

The Strong Vocational Interest Blank pro- 
vides ratings comparing the vocational interests 
of an individual with those of successful men 
in thirty-nine different occupations. Scores 
on this inventory are usually reported in terms 
of letter grades, A, B+, B, B—, C+, and C. 
That the Blank clearly identifies the majority 
of workers engaged in a particular occupation 
may be seen from the fact that approximately 
82 per cent of successful men or women in an 
occupation receive scores of A or B+, 14 per 
cent attain scores of B or B—, and only 3 to 4 
per cent score C+ or C. 

Occupational keys on Strong’s Vocational 
Interest Blank are arranged in groups on the 
profile report form for the test in accordance 
with the inter-correlations between the keys. 
Occupational keys which have an average 
correlation of .60 or more with other members 
of a group have been classified by Strong in the 
same general interest group. Thus, the keys 
for Personnel Director, Public Administrator, 
Y.M.C.A. Physical Director, Y.M.C.A. Secretary, 
Social Science High School Teacher, 
City School Superintendent, and Minister form 
what is usually referred to as the Social 
Service Group of the Vocational Interest 
Blank (Group V). This grouping is based on 
correlations ranging from highs of .87 between 
Y.M.C.A. Secretary and Y.M.C.A. Physical 
Director and an r of .75 between Personnel 
Director and Public Administrator, to a low 
of .41 between Personnel Director and that for 
Minister. 

The criterion group of Personnel Managers, 
on which the Personnel Director key of the 
Strong Blank was first standardized in 1928 
and again in 1938, consisted of only 147 
personnel men who were “carefully selected 
by competent authorities.” None was over 
60 years of age, with an average age for the 
group of 41.0 years. Average education was 
14.7 grades and all had been engaged in 
personnel and industrial relations work for 
over three years. They were employed in 
business and industry located primarily in 
the New England, Middle Atlantic, Great 
Lakes, and Pacific Coast states. Even though 
the criterion group was carefully selected, the 
number falls far short of the minimum number 
of 400 which Strong now believes is necessary 
for the development of valid and stable keys. 
The fact that professionalization of personnel 
work proceeded so rapidly during and after 
World War II also suggested the desirability of 
checking on the validity of this particular key. 


Procedure 


Data for this study were gathered by mail. 
Personnel executives included in the survey 
were sent a copy of the Strong Interest Blank, 
an Answer Sheet¹ for recording answers 
to the Strong Blank, and a data sheet designed 
to assist the research staff in classifying the 
respondent in the job classification appropriate 
to his position. The materials were sent to 
650 industrial relations executives. Approximately 
66 per cent returned the forms. The 
highest percentage of returns (85 per cent) was 
received from the original mailing to 100 
personnel executives who had cooperated in 
or to the original 


utives were sent 
r sheet and placed 


clive Pnswers directly on the blank. A 


pa be ioner 
al modified 
t Blank fot 
yer Sheet- J- 


es 




an earlier survey of the Industrial Relations 
Center in which detailed job descriptions of 
the work of these executives were obtained. 

In selecting the sample for study, only those 
persons in positions which are frequently filled 
by college graduates were selected. Lower 
level personnel positions, such as interviewers 
and personnel clerks, which are usually filled 
from the ranks without special academic 
training or with only on-the-job training were 
not included. Table 1 indicates the sources 
from which the sample was drawn and the 
percent of returns for groups to which the 
questionnaires were mailed. 

Each person who cooperated in the survey 
by returning completed Strong Blanks and 
data sheets was sent a complete profile 
reporting his standing on the 39 occupational 
keys and the 3 special keys for which the 
Strong Test is scored. 


Adequacy of Present Key 


Data presented in Table 2 indicate that the 
present Personnel Director Key is an adequate 
measure of the vocational interests of the in- 
dustrial relations directors, personnel directors, 
training directors, and wage and salary 
administrators who were included in the survey. 
The 24 employment managers, however, 
received a significantly lower percentage of 
A and B+ scores than did any of the other 
groups. 

These results are quite surprising. It was 
anticipated that Wage and Salary Administrators 
would be much more “legalistically 
minded” and “mathematically minded” than 
personnel directors and, hence, would exhibit 
less of the “social service” type of interest 
measured by the Personnel Director Key. 
But such did not prove to be the case. 

In general, it is concluded that the 1938 
Personnel Director Key is an accurate indicator 
of how the vocational interests of top industrial 
relations directors and other top personnel 
workers in business and industry agree with 
the interests of the original criterion groups 
used by Strong and how they differ from the 
interests of business and professional men in 
sales work, office work and types of executive 
work. For this reason, it is concluded that 

Table 1 
Industrial Relations Executives to Whom Strong Blanks Were Mailed and Number and Percentage Returned 

                                                 Number    Number of     Per Cent of 
                                                 in Each   Returns for   Returns for 
Personnel Groups                                 Group     Each Group    Each Group 

Top industrial relations executives (50 
  concerned with labor relations and 50 
  handling no labor relations) for whom 
  detailed job descriptions were available 
  in IRC files                                    100         85            85 
Sample of members of American Society of 
  Training Directors                              150        113            75 
Industrial relations executives who attended 
  the IRC technical conference, “Conducting 
  Wage Surveys” (held at the University of 
  Minnesota in 1949), and industrial relations 
  executives recommended for inclusion in the 
  study by conference members                     117         39            33 
Selected industrial relations executives on 
  IRC regular mailing list                        283        193            68 
Total                                             650        430*           66 

* Forty-four of the returns from the group were excluded from the analyses in this report because twenty-four 
blanks were incompletely filled out or contained errors which ... 
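
The return percentages in Table 1 follow directly from the mailed and returned counts. The sketch below, with shortened and hypothetical group labels, recomputes them and the number of usable records left after the forty-four exclusions noted in the table footnote.

```python
# Mailed and returned counts as given in Table 1; the labels are shortened
# here for illustration and are not the table's own wording.
mailings = {
    "top executives with job descriptions on file": (100, 85),
    "Training Directors sample": (150, 113),
    "wage survey conference group": (117, 39),
    "regular mailing list": (283, 193),
}


def percent_returned(mailed, returned):
    """Return rate rounded to the nearest whole per cent."""
    return round(100 * returned / mailed)


rates = {group: percent_returned(m, r) for group, (m, r) in mailings.items()}

total_mailed = sum(m for m, _ in mailings.values())
total_returned = sum(r for _, r in mailings.values())
overall_rate = percent_returned(total_mailed, total_returned)

# Forty-four returns were excluded from the analyses (table footnote).
usable_records = total_returned - 44
```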


the 1938 Personnel Director Key is not in 
need of revision at the present time. 


Patterns on All Occupational Keys 


The vocational interests of a given occupa- 
tional group are directly measured by the 
distribution of letter grades on the appropriate 
occupational key. In addition to this type 
of information, it is enlightening to note how a 
given occupational group rates on all of the 
occupational keys as well. In this way, one 
can determine those occupational groups that 
are least like the one under consideration as 
well as those which are more and more alike. 


Table 2 
Scores of Five Groups of Personnel Workers on the 
Strong Personnel Director Key 

                                         Per Cent of   Per Cent of   Per Cent of 
                                         A and B+      B and B−      C and C+ 
Group                              N     Scores        Scores        Scores 

Strong’s Criterion Group          146    84.1%         13.1%          2.8% 
Training Directors                113    86.7%         ...            3.5% 
Industrial Relations Directors    124    84.7%         12.9%          2.4% 
Wage and Salary Administrators     39    82.1%         15.4%          2.5% 
Personnel Directors                86    79.1%         18.5%          2.4% 
Employment Managers                24    62.5%*        29.2%          8.3% 

* Significantly different from Strong’s Criterion Group, p < .05. 


Table 3 gives these data for the subgroups as 
well as for the total group by showing the 
percentage of A and B+’s obtained on each 
of the 39 occupational keys. 

The heaviest concentrations of A and B+’s 
are produced by the Personnel Director and 
the Public Administrator Keys. In addition, 
it is apparent that there is a goodly proportion 
of A and B+’s on the following keys: Production 
Manager, Social Science High School 
Teacher, Office Man, Purchasing Agent, the 
three Sales Keys, and President of Manufacturing 
Concern. Presumably, these industrial 
relations men in addition to the high ratings 



Table 3 
Per Cent of A and B+’s on Each of Thirty-Nine Occupational Keys Obtained by Personnel 
Workers by Title and for Total Group 
Subgroups and Total Group 
                                 Wage 
              Train.   Ind. Rel.  and Sal.   Pers.   Emp.    Total 
              Dir.     Dir.       Admin.     Dir.    Mgr.    Group 
Occupational  N=113    N=124      N=39       N=86    N=24    N=386 
Scoring Keys  %        %          %          %       %       % 
L Artist 2.1 0.0 0.0 0.0 0.0 0.8 
Psychologist 2.7 4.0 0.0 4.7 0.0 3.1 
Architect 4.4 0.0 0.0 0.0 0.0 1.3 
Physician 1.8 0.8 0.0 2.3 8.3 1.8 
Osteopath 12.4 10.5 15.4 93 20.8 11.9 
Dentist 18 0.0 0.0 0.0 4.2 0.8 
ienaa 08 0.0 12 0.0 0.8 
hematicia 0.9 5 
Foals i ician 18 08 0.0 2.3 0.0 1:3 
Engineer 49.5 7.3 28.2 12.8 12.5 14.5 
Chemist 8.8 3.2 17.9 5.8 12.5 7.5 
It, Production Mar 64.6 53.2 64.1 55.8 62.5 58.8 
IV. Farrer 10.6 5.6 15.4 14.0 20.8 10.9 
Aviator 142 8.9 20.5 9.3 25.0 12.7 
Carpenter 1.8 0.0 5.1 2.3 4.2 18 
Aen Ate 56 17.9 7.0 ee 8.8 
P A 12. hi 
ee and Phys. Sci. Tchr. a A ‘ cs se 4.2 me 
Olicema: 15. 
“Orest Servier Man 8.0 = 2a = 2 a 
DATY s 30.6 25.6 26.7 16.6 29.0 
.M.C.A. Phys. Director 32.7 347 921 79.1 62.5 824 
ersonnel Director 86.7 83.1 89.7 77.9 79.2 83.4 
ublic Administrator 86.7 ie 128 25.6 25.0 28.5 
LMC.. Secretary S a 38.5 41.9 41.7 45.6 
Oc. Sci. i ; j ; 
City = oer — 33.6 a a E ‘eS e 
Mi e 115 9.7 Z _ 
iia a 24 2.6 7 8.3 5.4 
an i 
VIN ep i 44 10.5 20.5 15.1 4.2 104 
i 23.4 187 30.2 125 25.6 
Ccountant 19. 63.7 53.8 45.3 29.2 47.4 
Office Man 32.7 29.8 46.2 43.0 20.8 35.0 
urchasin 33.6 20.5 19.8 20.8 16. 
g Agent 14.5 6.1 
anker 34.7 25.6 24.4 33.3 30.6 
Ortician S 631 33.3 54.7 29.2 56.0 
v Sales Manager 61.9 50.0 30.8 47.7 37.5 43.8 
Real Estate Salesman 2 51.6 30.8 ar 33.3 44.3 
= € Insurance Salesman a 79.8 12.8 32.6 25.0 29.0 
* Advertisin 31.9 8.2 20.5 18.6 16.6 23.1 
Lawyer Man 23.0 er 26 7.0 83 73 
Ppp tor Journalist = 39 38.5 32.6 29.2 33.2 
27.4 a 


EP 
Fes. Mfg. Concern 




on the Personnel Key and the Public Admin- 
istrator Key break up into subgroups. Some 
probably have “primary patterns of interest” 
or “secondary patterns of interest” in Produc- 
tion Management and President of Manufac- 
turing Concern. (For an explanation of 
primary, secondary, and tertiary patterns of 
interest see Darley, 2.) Others probably have 
a primary or secondary interest in office detail 
work. And others have a primary or second- 
ary interest in sales occupations. This interest 
in the sales occupations which is exhibited by 
almost half of these industrial relations men 
may reflect an interest in face-to-face contacts 
with people. On the other hand, it may be 
due to the fact that industrial relations work 
is a recent newcomer to the management fold 
and thus, in many companies, may require a 
salesman to put the program over. 

The guidance worker who is aware of the 
fact of these additional primary and secondary 
patterns of interest will be able to advise a 
greater variety of persons to enter the field 
than would be true if such a diversity of 
interest patterns did not exist. 

The absence of any large proportion of A 
and B+’s in Groups I (Biological Science), 
II (Natural Science), IV (Skilled Trades), VI 
(Musician), VII (C.P.A.), and X (Verbal- 
Linguistic) is worthy of note. In the guidance 
situation, a person with a primary pattern of 
interest in one of these occupational groups 
who is considering industrial relations work 
might well be advised to question the suitabil- 
ity of his choice, even though he might have 
an A or B+ on the Personnel Director Key. 
These observations are in line with Darley’s 
excellent discussion of the use of vocational 
interest measurements in the vocational coun- 
seling situation (2). 


Summary 


The vocational interests of five groups of 
industrial relations workers were measured by 
means of Strong’s Vocational Interest Blank 
to determine the adequacy of the Personnel 
Key. A total of 386 usable records were 
obtained for analysis. 

The five subgroups studied were: industrial 
relations directors, personnel directors, employment 
managers, training directors, and wage 
and salary administrators. The Personnel 
Director Key appears to be an adequate 
measure of the vocational interests of these 
workers. Four of these groups received about 
the same percentage of A and B+ ratings as 
did Strong’s criterion group. The 24 employment 
managers, however, received a significantly 
lower percentage of high ratings than 
did the other groups (62.5 per cent versus 
about 80 per cent). 

Over 80 per cent of these industrial relations 
workers also received A and B+ ratings on the 
Public Administrator Key. Additional primary 
and secondary patterns of interests are 
disclosed for the following keys: Production 
Manager, Social Science H. S. Teacher, Office 
Man, Purchasing Agent, the three Sales Keys, 
and President of Manufacturing Concern. 

There is an equally significant absence of 
high ratings in the Occupational Keys falling 
into Groups I, II, IV, VI, VII, and X. 

The findings should prove of value in the 
vocational counseling of persons contemplating 
industrial relations work. 


Received February 28, 1952, 
Early publication. 


References 


1. Burk, S. L. H. The personnel profession: its 
present and future status. Personnel Series No. 
74. New York: A. M. A., 1943. 

2. Darley, J. G. Clinical aspects and interpretation of 
the Strong Vocational Interest Blank. New York: 
Psychological Corporation, 1941. 

3. Drake, C. A. Developing professional standards 
for personnel executives. Personnel, 1943, ... 
6-655. 

4. Heneman, H. G., Jr. Qualifying the professional 
industrial relations worker. Personnel, ..., 
25, 220-225. 

5. Kriedt, P. H., and Bentson, Margaret. Jobs in 
industrial relations. Bulletin No. 3. Minneapolis: 
University of Minnesota Press, 1947. 

6. Kriedt, P. H., and Stone, C. H. College courses 
for personnel work. Personnel J., 1948, ... 

7. Kriedt, P. H., and Stone, C. H. Industrial relations 
positions and personnel. Mimeographed 
Release 1. Minneapolis: University of Minnesota 
Industrial Relations Center, August ... 

8. Minnesota, University of, Industrial Relations Center. 
Industrial relations glossary. Bulletin ... 
Minneapolis: University of Minnesota Press, ... 

9. Parks, D. S. Survey of the training and qualifica- 
tions of practicing personnel executives. Toledo: 
University of Toledo, 1947 (mimeo). 

10. Spates, T. G. An objective scrutiny of personnel 
administration. Personnel Series No. 75. New 
York: A. M. A., 1944. 

11. Strong, E. K., Jr. Manual for vocational interest 
blank for men. Stanford: Stanford University 
Press, 1951. 

12. Strong, E. K., Jr. Vocational interests of men and 
women. Stanford: Stanford University Press, 
1943. 

13. Super, D. E. Appraising vocational fitness by means 
of psychological tests. New York: Harper and 
Brothers, 1949. 

14. Yoder, D. Professional associations in manpower 
management. Personnel J., 1948, 27, 43-46. 

15. Yoder, D., and Heneman, H. G., Jr. Quiz for per- 
sonnel executives. Mgmt. Rev., 1949, 38, 527- 
531. 

16. Yoder, D., and Nelson, Lenore P. Manpower 
managers: their habits, haunts, and customs. 
Personnel, 1950, 26, 413-417. 

17. Yoder, D., and Nelson, Lenore P. Personnel sal- 
aries and ratios in 1950. Personnel, 1950, 27, 
15-18. 


The Interests of Industrial Psychology Students * 


C. H. Lawshe and Stanley Deutsch 


Occupational Research Center, Purdue University 


Awareness of the role played by interests in 
successful accomplishment is increasing in 
psychological, industrial, and educational fields 
(1, 2, 3, 5). This concern is naturally 
connected with the problem of predicting 
areas of vocational success and degree of 
success. In the field of psychological interests 
Kriedt (3) has developed five keys for psychol- 
ogists using the Strong Vocational Interest 
Blank. The present study represents an 
application of Kriedt’s keys and their norms 
to graduate students majoring in Industrial 
Psychology. The purpose of the study was 
twofold: 

1. To determine how the Psychologist 
sub-keys for the Strong classified these students 
in terms of occupational grouping. Do the 
subjects tend to score higher on the Industrial 
key than on any of the others, as one would 
expect, or do the keys fail to discriminate for 
this sample? 

2. To determine whether or not the keys 
differentiate students according to their 
performance in graduate school. 


Procedure 


The Strong Vocational Interest Blank was 
administered to 37 male graduate students 
currently pursuing the Industrial Psychology 
curriculum at Purdue University and was 
scored on the five following Psychologist keys 
developed by Kriedt (3): Psychologist-in-General, 
Industrial Psychologist, Clinical Psychologist, 
Experimental Psychologist, and 
Guidance Psychologist. The raw scores were 
categorized according to the letter grade 
norms, and a Chi Square test was run between 
the frequencies obtained and those which could 
be expected according to the norms. A Chi 
Square test was also used to see whether or not 
the subjects scored significantly higher on the 
Industrial key than on the other keys. 


* The authors are indebted to Mr. Brya... for 
assistance in preparing this manuscript for publication. 


In the study of the ability of the keys to 
differentiate among the varying degrees of 
success of the students, some measure of this 
success was necessary. The criterion used was 
a paired comparison rating of the students by 
the four professors in the Occupational 
Research Center. These professors rated only 
those students whom they felt qualified to 
rate (6). The criterion reliability was determined 
by intercorrelating the ratings of the 
professors on the 23 students rated in common 
by all four. The average of the matrix of 
intercorrelations was found using Fisher's z 
transformations, and an estimated reliability 
of .87 for the pooled ratings was obtained by 
means of the Spearman-Brown formula. 

A composite criterion score was obtained for 
each student by combining the ratings of the 
professors. The ratings of each judge were 
weighted according to their reliability as 
determined by Shen’s formula (4). The composite 
of the ratings was correlated with the 
scores for each of the five Psychologist keys to 
determine what relationships, if any, existed. 
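
The reliability estimate described above combines two standard steps: averaging the pairwise inter-rater correlations through Fisher's z transformation, and stepping the average up to the reliability of the pooled ratings with the Spearman-Brown formula. The sketch below illustrates both. The six pairwise correlations are hypothetical, since the article does not report them, and the function names are illustrative only.

```python
import math


def fisher_z_mean(correlations):
    """Average correlations via Fisher's z (atanh), then transform back."""
    zs = [math.atanh(r) for r in correlations]
    return math.tanh(sum(zs) / len(zs))


def spearman_brown(r_single, k):
    """Reliability of the pooled ratings of k raters, given the average
    single-rater intercorrelation r_single."""
    return k * r_single / (1 + (k - 1) * r_single)


def implied_single_rater_r(r_pooled, k):
    """Inverse of Spearman-Brown: average intercorrelation implied by a
    pooled reliability r_pooled for k raters."""
    return r_pooled / (k - (k - 1) * r_pooled)


# Hypothetical intercorrelations among four raters (6 pairs); not study data.
pairwise = [0.55, 0.60, 0.62, 0.65, 0.66, 0.68]
mean_r = fisher_z_mean(pairwise)
pooled_reliability = spearman_brown(mean_r, k=4)

# Working backward from the reported .87 for four raters:
implied_r = implied_single_rater_r(0.87, k=4)
```

Working backward in this way, a pooled reliability of .87 for four raters corresponds to an average intercorrelation of roughly .63 between single raters.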


Results 


The results of the Chi Square test comparing 
the frequencies of occurrence of A and B+ 
grades for each key on the sample group with 
Kriedt’s normative population are shown in 
Table 1. It can be seen that the frequencies 
for the Purdue group did not differ significantly 
from the norms on the Industrial key, but for 
each of the other keys this sample obtained 
fewer A and B+ grades than did the criterion 
groups. In each of these cases the difference 
was significant at the 4% level of confidence 
or better. 


Table 1 

Expected and Obtained Frequencies for 
A and B+ Categories 

               Expected     Obtained     Chi 
Key            Frequency*   Frequency    Square    P 

Industrial        33           35           .63    ... 
Psy-in-Gen’l      31           20         22.13    ... 
Clinical          31           20         22.13    ... 
Experimental      33            5        212.0     ... 
Guidance          34           30          4.44    ... 

* Based on results obtained by Kriedt. 
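
The article does not state how the Chi Square values in Table 1 were computed. As an inference only, the values for the Industrial, Experimental, and Guidance keys agree with a one-degree-of-freedom goodness-of-fit test on two categories (A or B+ versus all lower grades) for the 37 students, with a correction for continuity. The sketch below shows that calculation; the function name is illustrative.

```python
def chi_square_continuity(expected_high, obtained_high, n):
    """Goodness-of-fit chi square for two categories (high grade or not),
    with a 0.5 continuity correction applied to each deviation.
    Assumed method: the article does not describe its computation."""
    expected = [expected_high, n - expected_high]
    obtained = [obtained_high, n - obtained_high]
    return sum(
        (abs(o - e) - 0.5) ** 2 / e for o, e in zip(obtained, expected)
    )


N_STUDENTS = 37  # male graduate students tested

# (expected, obtained) frequencies of A and B+ grades from Table 1.
frequencies = {
    "Industrial": (33, 35),
    "Experimental": (33, 5),
    "Guidance": (34, 30),
}

chi_squares = {
    key: chi_square_continuity(exp, obt, N_STUDENTS)
    for key, (exp, obt) in frequencies.items()
}
```

With the rounded expected frequency of 31 shown for the Psychologist-in-General and Clinical keys, the same formula gives about 21.9 rather than the tabled 22.13, presumably because the expected frequencies were rounded for the table.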

The frequency of A and B+ scores received 
by the Purdue students on the Industrial key 
was compared with the frequencies for the 
other four keys through use of the Chi Square 
test. These differences between keys were 
found to be significant at the 1% level of 
confidence in all cases except for Guidance 
where the significance level was 8%. Thus, the 
keys properly located these students in the 
Industrial Psychology interest area. 

The correlation of composite paired comparison 
ratings with the scores obtained on the 
five Strong keys did not produce any coefficients 
significantly different from zero, even 
after correction of the data for curtailment of 
range (6, 173). Thus the results of phase 
two of the study demonstrated that the 
Kriedt keys did not discriminate, on the basis 
of judged performance, among the students 
currently studying Industrial Psychology at 
Purdue. 



Summary 


The application of Kriedt’s keys and norms 
has correctly identified the group of students 
as professionally interested in Industrial Psy- 
chology with greater than chance accuracy. 
There was some indication of an interest in 
Guidance present. The group scored much 
higher on the Industrial key than on any of 
the others, but the keys failed to differentiate 
students according to their judged performance 
in graduate school. 


Received July 9, 1951. 


References 


1. Blum, L. P. A comparative study of students pre- 
paring for five selected professions including 
teaching. J. exp. Educ., 1947, 16, 31-65. 

2. Fryer, D. The measurement of interests. New York: 
Holt & Co., 1931. 

3. Kriedt, P. H. Vocational interests of psychologists. 
J. appl. Psychol., 1949, 33, 482-488. 

4. Shen, E. The reliability coefficient of personal 
ratings. J. educ. Psychol., 1925, 16, 232-236. 

5. Strong, E. K. Vocational interests of men and women. 
Stanford: Stanford Univer. Press, 1943. 

6. Thorndike, R. L. Personnel selection. New York: 
John Wiley & Sons, 1949. 


Interest Patterns and Retention and Rejection of Vocational Choice 


Arthur L. Traphagen 
Counseling Center, University of California, Berkeley 


In counseling at the college level, the follow- 
ing questions frequently arise: (1) Why do 
some college men who have expressed a 
serious vocational interest in secondary educa- 
tion retain this objective while others who have 
expressed the same serious interest reject it 
after a more thorough investigation of the 
field? (2) Do the factors of interest and level 
of aspiration discriminate between men who 
elect and then reject the field of secondary 
education, and those who elect and then retain 
this field? 

Most research relative to teacher selection
has been confined to comparing mental ability
and personality factors, mainly masculinity-femininity
of interests, of teachers and of
students of education, with other professional
and pre-professional groups. These studies
present either conflicting evidence or show
little difference between the groups studied.
Male students of education were found by
Nance¹ to have more feminine interests than
college men and men-in-general, with men in
elementary education having more feminine
interests than men in secondary education.
In studying male students in five professions,
Blum² found that college trained men in general
have a tendency toward feminine interests.
Education and journalism attract men with
tendencies toward feminine interests to a
greater extent than do law, medicine, and
engineering. On the Strong Vocational Interest
Blank each student tended to score highest
in his own occupational group. Blum concluded
the greatest difference between the
groups studied was one of interests. Strong³
states that among high school teachers there
are sharp differences in the interests of various
subject matter teachers; no common interest
is apparent.

¹ Nance. [Masculinity-]femininity in prospective teachers. J. educ. Res. [volume and pages illegible].

² Blum, L. P. A comparative study of students preparing for five selected professions including teaching.
J. exp. Educ., 1947, 16, 31-65.

³ Strong, E. K., Jr. Vocational interests of men and women. Stanford University, California: Stanford
University Press, 1943, p. 161.

Although aware of the above investigations,
the author, as a result of having counseled
a considerable number of students posing
this problem, had begun to develop the
following hypotheses:


1. Although both groups, those who retain
teaching as a vocational objective and those
who eventually reject teaching, would be
expected to have an interest in people for their
own welfare, the discriminating factors would
be the intensity of this interest and the existence
of other or competing interests.

2. There is probably a difference in level of
aspiration for the two groups with those
rejecting teaching having the higher aspirations.

3. There is little difference between the two
groups in masculinity-femininity of interests.


Procedure 


To investigate these hypotheses and attempt
to answer the questions posed, a pilot study
was undertaken using data from the Counseling
Center, University of California, Berkeley.
The Strong Vocational Interest Blank for
Men, 1938 revision, was selected as the evaluating
tool.

Subjects. The population from which the
sample was obtained consisted of veterans
who had received counseling at the University
of California Counseling Center between
October 1946 and July 1950. A total of 60
male college students compose the sample.

The 60 records studied were selected by
taking: (1) the first 30 cases, beginning with
case number 3500 begun in February 19[year illegible],
wherein the objective stated at the inception
of counseling was “high school teacher” and was
recorded as such at the conclusion of counseling;
and, (2) the first 30 cases wherein the
objective of “high school teacher” was stated
at the inception of counseling but a different
one recorded at the conclusion of counseling.




The age range of the sample is 21 to 33 for 
those retaining the teaching objective and 21 
to 39 for those rejecting it. 

Method. The cases were classified as “objective
retained” and “objective rejected.” Data
were tabulated separately for each group.
The individual Strong Record Blanks were
analyzed for primary and secondary interest
patterns and A or B scores on the two teaching
keys.

Realizing that interests of adults have to be
accepted and dealt with regardless of their
[...], the author decided to evaluate the
Blanks in the manner in which he tends
to [...] them in the counseling [...].
Therefore in this investigation [...]
the method of standard [...] alphabetical
classification for primary and
secondary patterns a relative system was used.

On each record blank the occupational groups
containing a preponderance of positive scores
(those which fall to the right of the shaded
area, regardless of numerical or alphabetical
score) were compared for strength and the one
of greatest magnitude was called a primary
group and the one of lesser magnitude a
secondary group. If two or more occupational
groups were of equal magnitude, [...]
the same position thus [...]
more than one primary or secondary group [...]
record blank. Each profile of interests on a
record blank is rated intrinsically.

To check the reliability of the investigator’s
interpretation of the interest patterns, he
reclassified the patterns on each record blank
after a lapse of ten days and obtained an
89 per cent agreement with himself.
[...] at the University of California
Counseling Center classified the patterns
[...] from each of the two groups [...]
showed an 85 per cent agreement with
[...].

[...] tabulated for [...] group [...]
difference [...] secondaries, [...].

[...] standard scores on the [...]
and Occupational Level [...]
for the two groups [...].


Results 


As seen in Table 1, certain occupational
groups and individual keys on the Strong are
found to be discriminating for selecting or
rejecting teaching at the secondary level.
Only when these groups are primary, however,
are they significant discriminators at the 5%
level of confidence.

A primary Group V (generally considered to
be an interest in people for their welfare) and
an outstanding score of A or B on the
Mathematics-Physical Science Teacher or Social
Science High School Teacher keys discriminate
positively for the retaining of a secondary
education objective. Primary Groups IX
(sales), X (verbalistic), and XI (executive)
discriminate positively for the rejection of a
secondary education objective. The indications
are that when one scores high on Group
V with an outstanding score on one or both of
the teaching keys he will probably retain the
objective of high school teaching. On the
other hand if he has primary patterns in
Groups IX, X, and XI or any combination of
them he will probably not find high school
teaching agreeable and will reject it in favor of
another field.

On the OL Scale the ranges of scores for the
two groups are: retaining, 37 to 62; rejecting,
40 to 71, with both distributions approximating
a normal curve. However, the men who
rejected teaching have a mean score of 56.2
whereas those who retained the objective
have a mean score of 51.3. This tends to indicate
that those who rejected teaching probably did
so because of a greater aspiration for prestige,
higher incomes and greater professional status
or recognition. It should not be inferred that
the teaching group is not ambitious, but is
probably less ambitious for executive responsibilities,
renown, or income.

No difference in masculinity-femininity of
interests is apparent from the range of scores
on the M-F Scale, 26 to 65 for the retaining
group and 17 to 63 for the rejecting group, or
from the mean scores, 43.5 for the retaining
and 40.2 for the rejecting group. However,
the distribution for the retaining group
approaches a normal distribution while the
rejecting group shows a skewedness toward
the feminine end of the scale.


Table 1

Distribution of Primary and Secondary Interests on the Strong Vocational Interest Blank for the Sample Groups
Retaining and Rejecting the Objective and the Significance of Differences of Percentages
between the Two Groups for Each Interest Group

                               I    II   III    IV     V    VI   VII  VIII    IX     X    XI   Math.  Soc. S.
                                                                                               Teach.   Teach.
Primary for Retaining          4     3     4     8    19    12     1     0     3     4     1      13      19
Primary for Rejecting          2     2     7     3    11     9     3     2    11    11     6       2       1
% Retaining                   13    10    13    27    63    40     3     0    10    13     3      43      63
% Rejecting                    7     7    23    10    37    30    10     7    37    37    20       7       3

Secondary for Retaining        6     3     2     2     5     5     2     4     4     6     3
Secondary for Rejecting        9     5     2     3     7     4     2     6     4     3     6
% Retaining                   20    10     7     7    17    17     7    13    13    20    10
% Rejecting                   30    17     7    10    23    13     7    20    13    10    20

Primary plus Secondary
  for Retaining               10     6     6    10    24    17     3     4     7    10     4
Primary plus Secondary
  for Rejecting               11     7     9     6    18    13     5     8    15    14    12
% Retaining                   33    20    20    33    80    57    10    13    23    33    13
% Rejecting                   37    23    30    20    60    43    17    27    50    47    40

Critical Ratio
  Primary                    [?]   .42  1.03  1.75 2.16*   .83  1.09  1.45 2.70* 2.18* 2.15*  3.6[?]     [?]
Critical Ratio
  Secondary                  .90   .79     0   .42   .60   .43     0   .73     0  1.08  1.08
Critical Ratio
  Primary plus Secondary     .33   .27   .90  1.18  1.66   [?]   .79   [?]   [?]   [?]   [?]

* Significant at the 5% level.
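
The critical ratios in Table 1 compare two percentages, each based on 30 cases. The article does not state the formula used, so the following is only a minimal sketch under an assumption: the standard error of the difference is taken as the unpooled form sqrt(p1q1/n1 + p2q2/n2). The function name `critical_ratio` is illustrative and not from the source. With the secondary Group I entries (20 per cent of 30 retaining, 30 per cent of 30 rejecting) this assumed formula gives a value close to the .90 printed in the table; the agreement may be coincidental, and other cells match less closely.

```python
import math


def critical_ratio(x1, n1, x2, n2):
    """Critical ratio for the difference between two independent proportions.

    Assumption: unpooled standard error, sqrt(p1*q1/n1 + p2*q2/n2).
    The article does not say which formula was actually used.
    """
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se


# Secondary Group I: 6 of 30 retaining (20%) vs. 9 of 30 rejecting (30%).
cr = critical_ratio(6, 30, 9, 30)
print(round(abs(cr), 2))

# A ratio of 1.96 or more (in absolute value) corresponds to the 5% level.
print(abs(cr) >= 1.96)
```

Under this reading, a difference of ten percentage points between groups of 30 falls well short of the 5% level, which is in line with the article's remark that only the primary patterns discriminate significantly.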


The question might be raised at this point
as to whether these results might not be a
direct outcome of the biases of the counselors.
That is, was the vocational choice a case of
“objective is confirmed by counselor” or
“counselee was successfully dissuaded from
teaching”? This is a recognized possibility
but if it was the case, the biases were unformulated
since no stereotyped interest pattern
for the occupation of high school teacher had
been developed or expressed by the staff.
The only known instances wherein persons had
been urged to relinquish teaching as an objective
were cases where a Group V interest was
non-existent.


Summary


The Strong Vocational Interest Blanks of
two groups were compared: (1) a group of
30 male college students who listed “high
school teacher” as the vocational objective
at the beginning of counseling and who retained
this objective at the conclusion of counseling;
and, (2) a group of 30 male college students
who originally listed “high school teacher” as
the vocational objective but who rejected it in
favor of another objective at the conclusion of
counseling.

Although the N for each group is small, the
following was found:


1. Only primary interest patterns are significant
discriminators at the 5% level of confidence.

2. A primary Group V and a score of A or B
on the Mathematics-Physical Science Teacher
and/or Social Science High School Teacher
keys discriminate in favor of the retention of
high school teacher as an objective.

3. Primary Groups IX, X, and XI discriminate
against the retention of the objective
“high school teacher.”

4. The men who rejected high school teaching
during the counseling process averaged
five points higher on the OL Scale than those
who retained the objective.

5. No apparent difference in masculinity-femininity
of interests was found to exist
between the two groups.


Received July 13, 1951. 


A Validation of the SRA Youth Inventory 


A. J. Drucker and H. H. Remmers 


Division of Educational Reference, Purdue University 


The SRA Youth Inventory has been designed 
as a tool to help teachers, counselors, and 
school administrators identify quickly the 
problems that young people say worry them 
most. Although validity is generally prefer- 
ably determined against an outside criterion, 
the instrument is supposed to provide an 
indication of what a student thinks are his 
problems. For this, the authors argue, there 
is “no obvious or readily available outside 
criterion. The items which an individual 
checks have validity for that individual. As 
long as the student thinks that certain things 
bother him it makes little difference whether 
the problems are real or whether he is uncon- 
sciously exaggerating their importance. The 
measure of validity becomes, in a sense, the 
reliability coefficient, for no test can be any 
more valid than it is reliable.”¹

In addition to such statements of definitional 
validity, however, it is of interest to test the 
instrument’s predictive validity. Can the Youth 
Inventory rank teen-age groups on the basis 
of good adjustment as recognized by trained 
personnel, the first step to predicting the 
quality of their adjustment? 

The Youth Inventory was administered to 
all of the 7th and 8th grade pupils in an urban 
school in northern Indiana. A total of 392 
sets of responses were received and scored at 
Purdue University. Meanwhile, before the 
scores were known, eight regular school coun- 
selors were instructed to designate the 20 per 
cent of the school’s 7th and 8th grade popula- 
tion that they knew and considered best 
adjusted and the 20 per cent that they knew 
and considered least well-adjusted. 

¹ Remmers, H. H., and Shimberg, B. Examiner Manual for the SRA Youth Inventory. Chicago:
Science Research Associates, 1949.

On the basis of ratings received, students
were put into five different groups: (1) those
who had received “good” ratings from two or
more of the eight counselors and no “poor”
ratings or those whose “good” ratings exceeded
the “poor” ratings by 3 to 1 or better; (2)
those who had received only one rating and
that rating having been a “good” rating; (3)
those on whom there was less than 3 to 1
unanimity of agreement of the counselors;
(4) those who had received only one rating and
that rating having been a “poor” rating; and
(5) those who had received “poor” ratings from
two or more of the eight counselors and no
“good” ratings or those whose “poor” ratings
exceeded the “good” ratings by 3 to 1 or better.
Some of the disagreement among counselors
becomes apparent when computing the proportions
of pupils in the five groups. Students
fell into Groups 1 to 5 as follows:
Group 1 = 53; Group 2 = 83; Group 3 = 47;
Group 4 = 52; and Group 5 = 68.

Analysis was based on Groups 1 and 2
versus Groups 4 and 5 omitting Group 3
entirely, for a total N of 256. Combining
Groups 1 and 2 gave 35 per cent of the total
and combining Groups 4 and 5 gave 31 per
cent of this total. Results of testing differences
between sets of means computed for the
two groups appear in Table 1. Differences
between the high and low adjustment groups
were significant at or beyond the one per cent
level for all areas except “Looking Ahead,”
dealing with vocational and educational future,
and “Things in General,” which deals with
problems of religion, philosophy and attitudes
concerning the world about the individual.
Differences for these two areas were not significant.
Larger “Student’s” t’s are obtained for
Basic Difficulty scores (indicating potential
maladjustment) and Areas 1 and 3, “My
School” and “About Myself.” Practically,
these findings make a pretty good case for the
validity of the Youth Inventory, where it is
used to predict 7th and 8th graders’ adjustment
as judged by trained counselors.

The last two columns of Table 1 give an
indication of the representativeness of the
Northern Indiana sample as compared with
the group of 1,000 7th and 8th graders on
whom the norms of the Youth Inventory are
based. In terms of mean scores, the Indiana
subjects, as a group, check somewhat fewer
problems in all areas than do subjects of the
national sample.


Received June 14, 1951.


Table 1

SRA Youth Inventory Mean Scores and Standard Deviations of Well and Poorly Adjusted 7th and 8th Graders
and of a National Sample of 7th and 8th Graders

                                    Poorly Adjusted    Well Adjusted                 National Sample
                                        N=120              N=136       Student's        N=1,000
Score                                Mean    Sigma      Mean    Sigma      t         Mean    Sigma
Basic Difficulty                     19.1     16.2      10.6     10.7     5.01       16.7     13.8
Area 1 - My School                    7.7      5.2       4.4      4.2     5.61        6.8      4.7
Area 2 - Looking Ahead                9.4      7.4       9.2      7.5      .01       10.6      7.3
Area 3 - About Myself                10.6      8.4       5.9      5.7     5.26        9.1      7.2
Area 4 - Getting Along with Others    9.2      8.8       6.2      6.4     3.16        9.4      8.0
Area 5 - My Home and Family           6.5      9.3       2.5      4.9     4.38        5.4      7.2
Area 6 - Boy Meets Girl               4.9      6.4       3.1      4.2     2.63        5.9      6.1
Area 7 - Health                       3.4      4.1       2.2      2.5     2.71        3.9      3.7
Area 8 - Things in General            [?]      [?]       3.2      4.8     1.73        5.7      6.6
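
The “Student’s” t values in Table 1 can be approximated from the printed means, sigmas, and N’s alone. The article does not say how t was computed, so the sketch below rests on an assumption: a pooled-variance t for two independent groups, with each sigma treated as a sample standard deviation. The function name `pooled_t` is illustrative. For the Basic Difficulty row (poorly adjusted: 19.1, 16.2, N=120; well adjusted: 10.6, 10.7, N=136) this assumed formula comes out close to the printed 5.01; some other rows agree less exactly, presumably because of rounding in the printed values.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t for two independent groups from summary statistics.

    Assumption: pooled variance with (n - 1)-weighted sample variances.
    The article does not state the formula it used.
    """
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se


# Basic Difficulty: poorly adjusted (N=120) vs. well adjusted (N=136).
t_basic = pooled_t(19.1, 16.2, 120, 10.6, 10.7, 136)
print(round(t_basic, 2))
```

With 254 degrees of freedom the one per cent point of t is roughly 2.6, so a value near 5 is comfortably beyond the level the article reports.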


Paired Comparison Ratings: 2. The Reliability of Ratings Based on Partial Pairings


Ernest J. McCormick
Occupational Research Center, Purdue University

and

William K. Roberts
Aero-Medical Laboratory, Dayton, Ohio


In a previous investigation McCormick and 
Bachus (2) examined the effect on paired 
comparison employee ratings of reductions 
in the numbers of individuals within the 
total group with whom each was paired. 
Various “patterns” of partial pairings were 
developed, each of which resulted in a different 
number of pairs per individual, though for 
each pattern each individual was paired with 
an equal number of others within the group. 
By correlating the rating indexes resulting 
from each pattern with the rating indexes 
resulting from a complete pairing, it was found 
that substantial reductions could be made in 
the number of pairs without affecting materially 
the rating indexes of the individuals. 

With groups of 50 individuals, for example, 
it was found that the total number of pairs 
could be reduced from 1225 to about 425 and 
still give correlations of about .95 between 
the ratings so derived and the original ratings 
based on complete pairings. Somewhat par- 
allel results were obtained with groups of 30 
individuals. 

Since the pairs included in each partial
pairing, however, were also included in the
complete pairing, and would accordingly affect 
somewhat the resulting correlations, the results 
of that investigation cannot be considered as 
representing specifically the reliability of the 
partial pairing technique. For the purpose of 
examining reliability, it would be necessary to 
determine the extent to which mutually- 
exclusive sets of pairs produce essentially 
comparable results. 

It was therefore the purpose of the present 
investigation to examine the reliability of the 
partial pairing technique as applied to em- 
ployee rating. 


Experimental Procedures 


Employees Rated. The employees used by
McCormick and Bachus (2) were also used
as the subjects of the present investigation.
Two independent groups of 50 employees each
from different departments in a manufacturing
company had been rated by their respective
foremen. Group I consisted entirely of women
who worked in the assembling department;
these workers were engaged in the task of
assembling small parts of electric meters.
Group II consisted of 48 women and 2 men
who worked in the machine department;
these workers were engaged in forming and
finishing small parts to be used in electric
meters.

Rating of Employees. In the previous study,
IBM cards that had the names of each pair of
individuals (1,225 pairs in each group) typed
on them were “sorted” by machine into
random order. The cards were then presented
to the foreman in this random order with
typewritten instructions to mark, for each
pair, the employee whom he considered to be
doing her (his) present job better. Each of
the two foremen rated only the members of his
own group.

Patterns of Mutually-Exclusive Sets of Pairs.
Various “patterns” for determining mutually-exclusive
partial pairings were developed. In
order to achieve mutually-exclusive pairings,
two independent sets of pairs were developed
for each pattern; they were identified as the
“x” and “o” sets. Since no pair of employees
was duplicated in the two sets, each of the
two sets of samples of judgments for a given
pattern was then completely independent, a
condition which was necessary in order to
establish adequately the reliability of the
sampling of partial pairings for the various
patterns. In pattern Q, for example, employee
number one in the “x” set is paired with
employees number 6, 7, 18, 19, 30, 31, 42, and
43, whereas employee number one in the “o”
set of the same pattern is paired with employees
number 5, 8, 17, 20, 29, 32, 41, and 44. For
certain levels of partial pairings it was impossible
to devise two independent sets of pairs
using exactly 50 employees; the N’s for the
different patterns therefore vary somewhat,
ranging from 47 through 50.
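
The pattern Q example can be checked mechanically. The sketch below is an interpretation, not the authors' stated procedure: it assumes that a pattern acts like a fixed set of "offsets" applied around a circle of N employees, so that employee i is paired with employee i + d (wrapping past N) for each offset d. The offsets (5, 6, 17, 18) for the “x” set and (4, 7, 16, 19) for the “o” set are inferred from the partners listed for employee number one with N = 47; the names `circulant_pairs` and `partners` are illustrative. Under these assumptions the construction reproduces the listed partners, gives 8 pairs per individual in each set, and yields two sets with no pair in common.

```python
def circulant_pairs(n, offsets):
    """Unordered 1-based pairs (i, j) where j lies d places after i on a
    circle of n employees, for every employee i and every offset d."""
    pairs = set()
    for i in range(1, n + 1):
        for d in offsets:
            j = (i - 1 + d) % n + 1
            pairs.add((min(i, j), max(i, j)))
    return pairs


def partners(pairs, employee):
    """Sorted list of everyone paired with `employee` in a set of pairs."""
    return sorted(b if a == employee else a
                  for a, b in pairs if employee in (a, b))


N = 47                       # N reported for pattern Q
X_OFFSETS = (5, 6, 17, 18)   # assumption: inferred from the "x" example
O_OFFSETS = (4, 7, 16, 19)   # assumption: inferred from the "o" example

x_set = circulant_pairs(N, X_OFFSETS)
o_set = circulant_pairs(N, O_OFFSETS)

print(partners(x_set, 1))            # partners of employee 1, "x" set
print(partners(o_set, 1))            # partners of employee 1, "o" set
print(len(x_set), len(o_set))        # pairs per set
print(x_set.isdisjoint(o_set))       # mutually exclusive?
print(50 * 49 // 2)                  # complete pairing of 50 employees
```

Each set holds 47 x 8 / 2 = 188 pairs, against 1,225 pairs for a complete pairing of 50 employees.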

These patterns, used for the mutually-exclusive
partial pairings of the original groups,
provided for the pairing of each individual in
the group with various numbers of other
individuals as follows (the letter identifies the
pattern, and the number given is the number
of pairs per individual per set): L-24; M-20;
N-16; P-12; Q-8.

In order to be able to examine for a smaller
group of employees the reliability of ratings
based on partial pairings, a random sample of
30 was extracted from each of the two groups
of 50. The 30 individuals selected from the
original Group I were designated Group III
and those 30 individuals selected from the
original Group II were designated Group IV.
These groups of 30 employees were the same
as used by McCormick and Bachus in one
phase of their investigation.

The same procedure for determining the
mutually-exclusive and independent sets of
pairings was used for Groups III and IV as was
used for Groups I and II, with the exception of
the smaller N. For one of the patterns it was
necessary to use 29 rather than 30 individuals,
in order to develop two mutually-exclusive
sets of pairs of equal size. The patterns used
and the corresponding numbers of pairs per
individual per set were as follows: L-14;
M-12; R-8.

The Character of the Patterns. For each
pattern a triangular matrix can be developed
which will indicate, by identification numbers,
which individuals shall be paired in the “x”
and in the “o” sets of pairs respectively.
For each pattern there will be certain N’s
for which the pattern results in an equal
number of pairs per individual in each of the
two mutually-exclusive sets. For other N’s
the pattern would result in unequal numbers
of pairs per individual in one or both sets.
The N’s for which equal pairings do result,
however, increase in multiples of a constant
which can be empirically determined for any
given pattern; this constant can be thought
of as a rhythm, and at any given N at which
this rhythm is complete there will be an equal
number of pairs for each of the N individuals
in each of the two sets.

Figure 1 shows, as an illustration, the matrix 
with two mutually-exclusive sets of pairs (“x” 
and “o”) for pattern L for a group of 9 employees,
in which pattern each person is paired
with 4 other employees in each of the two sets. 
This pattern has a rhythm of four, as indicated 
by the horizontal dotted lines; it could there- 
fore be used for the N’s of 5 or 9, as illustrated, 
or for larger N’s which increase in multiples 
of 4, as for example 13, 17, 21, 25, 29, etc. 

Figure 2 shows part of the first column of all 
the patterns used, either for the groups of 
approximately 50 or of 30. For any given 
pattern this column identifies the individuals 
with whom employee number one is paired for 
the “x” and for the “o” sets of pairs; the 
underscoring indicates the end of the rhythm 
of the pattern. An extension of the rhythm 
downward in the first column would identify 
the individuals with whom employee number 
one would be paired in larger groups. In
order to identify all pairs (for each of the
two sets) the pattern, characterized by column
one, need only be extended systematically
downward and to the right to form a complete
triangular matrix, as illustrated in Figure 1.

Fig. 1. Illustration of matrix of two mutually-exclusive sets of pairs (“x” and “o” identify pairs
included in respective sets).

Fig. 2. Segments of first columns of patterns of mutually-exclusive sets of pairs (“x” and “o” identify
pairs included in respective sets; underscoring shows beginning and ending of “rhythm”).

Deriving Ratings Based on Various Patterns.
In using the various partial patterns the foremen
were not required to re-rate the members
of their groups. The cards containing the
pairings required for a given set of a given
pattern were extracted from all of the cards
used in the initial complete pairing of each
original group. A tally was then made for
each of the two sets (“x” and “o”) of the
number of times each employee was chosen
over the others with whom he was paired in
each set; rating indexes were then determined
on the basis of these tallies.


Results 


As indicated above, certain patterns were
used with the two groups of approximately
50 individuals, and other patterns were used
with the two groups of approximately 30.
For each group with which a particular
pattern was used, a correlation was computed
between the rating indexes resulting from the
two mutually-exclusive sets (“x” and “o”); such
a correlation then reflects the degree to which
the pairs in the two independent sets result in
comparable rating indexes. All of the correlations
so derived are presented in Table 1.


Table 1

Correlations, for Two Independent Samples, of Scale Values Resulting from Two
Mutually-Exclusive Sets of Partial Pairings

                                              Correlations
                   Pairs per   Total No.     for Two Groups          t
Pattern     N     Individual   of Pairs    Group I    Group II     Ratio
   L       49         24          588        .932       .919        .38
   M       50         20          500        .872       .833        .08
   N       47         16          376        .896       .838       1.13
   P       49         12          294        .824       [?]         [?]
   Q       47          8          188        .824       .723       1.17

                                           Group III  Group IV
   L       29         14          203        .775       [?]         .22
   M       30         12          180        .896       .874        .37
   R       30          8          120        .700       .713        [?]


Since each pattern had been used with two
samples (Group I and Group II, or Group III
and Group IV) it was then possible, for each
pattern, to test for significance the differences
between the correlations for the two groups.
This was done in order to ascertain whether the
magnitude of the resulting correlations could
or could not reasonably be attributed to chance
factors.

The correlations were accordingly converted
to Fisher’s z; the standard error of the difference
between those resulting from a given
pattern was then computed using the formula
presented by Guilford (1, p. 224). For each
pattern a t ratio was then computed. These
results are also presented in Table 1. It will
be observed that none of these t ratios even
approximates the five per cent confidence limits
(1.96). This would seem to suggest that the
magnitudes of the reliability coefficients cannot
then reasonably be attributed to chance
factors, and that they presumably are largely
a function of the patterns on which the rating
indexes were based.
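
The comparison of two correlations through Fisher's z can be sketched as follows. The exact formula on the cited page of Guilford is not reproduced in the article, so this example assumes the commonly used form: each r is transformed by z = arctanh(r), and the standard error of the difference is sqrt(1/(N1 - 3) + 1/(N2 - 3)). The function name `fisher_z_ratio` is illustrative. For pattern M in Groups III and IV (r = .896 and .874, N = 30 each) this assumed form gives a ratio close to the .37 printed in Table 1.

```python
import math


def fisher_z_ratio(r1, n1, r2, n2):
    """Ratio of the difference between two Fisher z values to its
    standard error.

    Assumption: SE = sqrt(1/(n1 - 3) + 1/(n2 - 3)), the usual form for
    two independent samples; the article only cites Guilford for it.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (z1 - z2) / se


# Pattern M, Groups III and IV: r = .896 and .874 with N = 30 each.
ratio = fisher_z_ratio(0.896, 30, 0.874, 30)
print(round(ratio, 2))

# Compare with the 1.96 limit quoted in the text.
print(abs(ratio) < 1.96)
```

A ratio this far below 1.96 indicates that the two groups' correlations for the same pattern do not differ reliably from one another.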

Reliability for Groups of Approximately Fifty.
It can be seen that the correlations for the
patterns used with the two groups of approximately
50 range from .932 and .919 for pattern
L to .824 and .723 for pattern Q. The correlations
tend to decline rather consistently with
reductions in the number of pairs on which
the ratings are based.

In a practical situation, the extent to which
it would be reasonable to reduce the number of
pairs on which ratings were to be based
probably would be largely a matter of the
purposes of the ratings. If the uses of a
particular set of ratings were such that high
reliabilities would be essential, it probably
would be preferable to use a pattern of partial
pairing based on a fairly large number of pairs
per individual; if ratings with somewhat lower
reliabilities could serve the intended purposes,
a more marked reduction in the number of
pairs could be made.

Reliability for Groups of Approximately
Thirty. The reliability for both groups of
thirty individuals was highest for pattern M,
which resulted in each person being paired with
12 others per set. There seems to be [...]
plausible [...] somewhat lower [... pairs]
per individual per set. This drop in
reliability could be taken to mean that,
with smaller groups of individuals, reductions
in the numbers of pairs per individual result
in more variability in reliability than do
corresponding proportionate reductions with larger
groups of individuals. It is also possible that
the unique sampling of pairs for one or both of
the sets of pairs for this pattern resulted
in some type of systematic bias in the [...]
pairs which produced the lowered correlations.


Summary and Conclusions 


Two groups of 50 industrial employees were 
rated independently by their respective fore- 
men using the method of paired comparison; 
all possible pairs of employees were rated. 
Various patterns of mutually-exclusive partial 
pairings were developed; in order to achieve 
mutually-exclusive pairings, two independent 
sets of pairs were developed for each pattern. 
The various patterns reduced from 24 to 8 the 
number of individuals with whom each 
employee in each independent set of pairs was 
paired; the total number of pairs for the various 
patterns ranged from 588 to 188 (a complete 
pairing resulting in 1,225 pairs). 

In order to be able to examine the reliability 
with a smaller group of employees, a random 
sample of 30 was extracted from each of the 
two groups of 50. Basically the same proce- 
dures were used with the smaller groups as 
were used with the larger groups in developing 
patterns of partial pairings. 

Performance indexes were computed from 
the ratings made on the two independent sets 
of pairs included in each pattern. For each 
pattern the rating indexes derived from the 
two independent sets of pairs were then 
correlated. For the two groups of 50 individ- 
uals the range of these correlations was from 
.932 (for a pattern involving 24 pairs) to .723
(for a pattern involving 8 pairs). For the 
two smaller groups of 30 individuals the 
correlations varied between .896 and .700; the 
patterns involved reductions in the number of 
pairs per individual from 14 to 8. 

It should be kept in mind that the reliability 
involved in this analysis is that of the sampling 
of the original judgments made by the raters, 
and is not then strictly the reliability of the 
original judgments themselves. 

On the basis of the results of the experiment 
the following conclusions seem warranted: 


1. Using the paired comparison system, the
reliability of ratings obtained with partial
pairings (which markedly reduced the total
number of pairs to be rated) for the larger
groups of 50 individuals tends to decrease
rather systematically with reductions in the
number of pairs per individual on which the
ratings are based.

2. For groups of 50 individuals, ratings
based on as few as 16 pairs per individual
appear to be relatively reliable.

3. For groups of 30 individuals, the reliabilities
of ratings based on partial pairings are
somewhat lower, and seem to be somewhat
more variable, than ratings for larger groups of
individuals based on corresponding proportionate
reductions in the numbers of pairs
per individual.


Received May 28, 1951.


References


1. Guilford, J. P. Fundamental statistics in psychology
and education. New York: McGraw-Hill Book
Co., Inc., 1950.

2. McCormick, E. J., and Bachus, J. A. Paired comparison
ratings: 1. The effect on ratings of reductions
in the number of pairs. J. appl. Psychol.,
1952, 36, 123-127.


The Effect of Varying Intensities of Illumination upon 
Performance on a Motor Task 


Ernest J. McCormick


Occupational Research Center, Purdue University 


and 


Jarold R. Niven 


International Harvester Co. 


In the field of illumination, much emphasis
has been placed upon the study of the proper
light intensity to be used for specific types of
visual tasks (3, 5). Luckiesh (1) has pointed
out that intensities, as suggested in lighting
specifications, have been increased over a
period of years for these various visual tasks.
Such specifications have been established for
the school, the home, the office, and for
industry (3).

Some controversy, however, has arisen over
the recommended intensities set forth in these
various lighting codes (2, 5). The proponents
of one school recommend as high intensities as
possible while others argue that optimum levels
of intensity exist for various types of jobs.

It was the purpose of this investigation to
examine the effect of three intensities of
illumination upon the performance of a motor
task that required visual control. The three
levels of illumination included a range sufficiently
large to determine the existence of an
optimum level if it should be present. However,
further investigation utilizing intermediate
levels of illumination would be needed
for establishing the optimum level.


Experimental Procedure 


The Test. The test used in the
investigation was the Purdue Hand Precision
Test (4, p. 128-129). The test was selected for
the investigation since there is reason to
believe that the combination of visual and
motor skills involved in the test are typical of
those that are also required on
industrial jobs. Performance on the test
consists of inserting a stylus successively in


three holes that are uncovered by a shutter 
which rotates at a rate of 126 holes per minute. 
The three holes, each .5 of an inch in diameter, 
are in a metal disk beneath the rotating shutter; 
the three holes form an equilateral triangle 
measuring 3.5 inches on a side. Scoring is in 
terms of errors which occur when the stylus 
comes in contact with the side of the hole, 
with the disk, or with the rotating shutter; 
the scoring is done electrically through a 
counter that registers an “error” when the 
metal stylus touches the side of the hole, the 
disc, or the shutter. 

Conditions of Illumination. Three levels of 
illumination were provided in the experimental 
conditions; these levels were 5, 50, and 150 
foot candles, respectively. Illumination for 
these three levels was provided by electric 
lamps mounted on a frame 52 inches square at 
a distance of 55 inches above the work table. 
Constant voltage transformers were used in 
the lines. A Watson Illumination Meter was 
used to determine the intensity levels at the 
work place; control of the levels of illumination 
was accomplished by means of rheostats. 

Administration of Test. Subjects consisted 
of 27 of the typical experiment-prone elemen- 
tary psychology students; all were males, 
with an age range of from 19 to 37 and a 
median age of 22. 

The test was performed by the subjects 
under the three levels of illumination in a 
counterbalanced order of presentation in order 
that the effects of fatigue, learning, practice, 
and motivation were kept relatively constant 
for all three conditions.

The subjects were divided into three groups;
the groups took the test under the three
illumination levels in the manner outlined
below:


             Sequence of Illuminating
Group   N    Conditions (foot candles)
  1     9       5      50     150
  2     9      50     150       5
  3     9     150       5      50


Five minutes were allowed for adaptation 
when going from the highest to the lowest 
levels and three minutes between the remaining 
levels. 

A practice period on the test of one minute 
was given each subject. A score for each 
subject under each condition of illumination 
consisted of the number of errors which he 
made on the test in a one minute period. 


Results 


The mean error scores of all 27 individuals 
under foot candle levels of 5, 50, and 150 were 
31.56, 25.26, and 23.00, respectively. Perform- 
ance improved with increasing levels of 
illumination, as reflected by decreases in mean 
error scores on the test. The decrease in error 
scores from the condition of 5 foot candles to 
50 foot candles, however, was much greater 
(31.56 to 25.26) than the decrease from the 
condition of 50 foot candles to 150 foot candles 
(25.26 to 23.00). 

An analysis of variance was made as a
preliminary statistical analysis; the results are
summarized in Table 1. The F ratio for the
levels of illumination was significant beyond
the 1% level of confidence, indicating that
illumination was undoubtedly affecting performance
on the test. In order to analyze the
more specific effect of levels of illumination,
the significance of the differences between mean
performance scores for the three illumination
levels was then computed; the resulting t
ratios are summarized in Table 2. Significant
differences were found between mean performance
scores of the 5 foot candle and the 50
foot candle levels, and of the 5 foot candle and
the 150 foot candle levels. The difference
between mean performances at the 50 foot
candle and 150 foot candle levels, however,
was not significant.


Table 1

Analysis of Variance

                                  Sum of       Mean
Source of Variance      d.f.      Squares      Squares     F
Illumination levels       2       1,061.51     530.76      10.57*
Subjects                 26      12,369.00     475.73       9.48*
Error                    52       2,611.85      50.23
Total                    80      16,042.36

* Significant beyond the 1% level of confidence.


Table 2

Differences between Mean Scores

Level of Illumination
in Foot Candles          Means      t ratio
    5                    31.56      3.26**
   50                    25.26

    5                    31.56      4.43**
  150                    23.00

   50                    25.26      1.17*
  150                    23.00

* Significant at the 30% level of confidence.
** Significant beyond the 1% level of confidence.
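
The reported F and t ratios can be checked from the sums of squares and means given in the tables. The sketch below is an assumption about the computation, not a statement of the authors' procedure: it recomputes each mean square as sum of squares divided by degrees of freedom, and it forms each t ratio with a standard error based on the error mean square, which happens to reproduce the published values closely. Recomputed this way, the Subjects mean square is about 475.7, the value consistent with the reported F of 9.48.

```python
from math import sqrt

N_SUBJECTS = 27

# Sums of squares and degrees of freedom from the analysis of variance table.
ss = {"illumination": 1061.51, "subjects": 12369.00, "error": 2611.85}
df = {"illumination": 2, "subjects": 26, "error": 52}

# Mean square = sum of squares / degrees of freedom.
ms = {source: ss[source] / df[source] for source in ss}

f_illumination = ms["illumination"] / ms["error"]
f_subjects = ms["subjects"] / ms["error"]

# Mean error scores at each level of illumination (foot candles).
means = {5: 31.56, 50: 25.26, 150: 23.00}

# Assumed standard error of a difference between two condition means,
# using the error mean square with every subject tested under both conditions.
se_difference = sqrt(2 * ms["error"] / N_SUBJECTS)


def t_ratio(low_level: int, high_level: int) -> float:
    """t ratio for the difference between the means at two illumination levels."""
    return (means[low_level] - means[high_level]) / se_difference


print(f"F (illumination) = {f_illumination:.2f}")
print(f"F (subjects)     = {f_subjects:.2f}")
for pair in [(5, 50), (5, 150), (50, 150)]:
    print(f"t {pair} = {t_ratio(*pair):.2f}")
```

Small differences in the last decimal place from the published figures are to be expected from rounding of intermediate values.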


Summary and Conclusions 


Twenty-seven subjects were tested with the
Purdue Hand Precision Test at three levels
of illumination, 5 foot candles, 50 foot candles
and 150 foot candles. Performance increased
with an increasing intensity of illumination.
Significant differences in performance (one per
cent confidence level) were found between the
5 foot candle and 50 foot candle levels, and
between the 5 foot candle and 150 foot candle
levels, but the difference in performance
between 50 foot candles and 150 foot candles
was not significant.

The study suggests that increasing the level
of illumination beyond 50 foot candles will not
have any significant effect upon performance
on a task of this nature. There is then the
implication that, in terms of performance on
a task of this nature, the optimum level of
illumination is somewhere between 5 foot
candles and 50 foot candles; since intermediate
levels between these two points were not
investigated, however, the study does not
reveal the illumination level (between these
limits) which would be optimal.

No attempt was made to determine the
physical or psychological effect of intensity of
illumination other than that of performance
on the task.


Received July 2, 1951.


References


1. Luckiesh, Matthew. Foot candles for critical seeing.
Illum. Engng., N. Y., 1946, 41, 828-846.

2. Ryan, T. A. Work and effort. New York: Ronald
Press Co., 1947.

3. Tinker, M. A. Illumination studies for effective and
easy seeing. Psychol. Bull., 1947, 44, 435-450.

4. Tiffin, Joseph. Industrial psychology. New York:
Prentice-Hall, Inc., 1947.

5. Winslow, A. How many foot candles? J. appl.
Psychol., 1947, 31, 140-142.


Dial Reading Performance as a Function of Color of Illumination 1


S. D. S. Spragg and M. L. Rock 2


University of Rochester 


An earlier report (3) has presented data 
concerning the speed and accuracy of reading 
instrument dial reproductions as a function 
of the intensity of the white (Mazda) light 
provided. The present experiment is a com- 
panion study, relating dial reading performance 
to the spectral distribution of the illumination 
provided. 

The possibilities of colored light for instru- 
ment, cockpit, and similar illumination prob- 
lems have commanded considerable attention 
in recent years. Part of this interest has 
stemmed from the well-known fact that 
monochromatic illumination (such as that 
from a sodium vapor lamp) offers certain 
advantages in seeing small, near-threshold 
objects, especially at low levels of illumination
(1). Most of it, however, arises from problems 
pertaining to the achievement and maintenance 
of satisfactory levels of dark adaptation. 

The advantages which monochromatic light 
possesses with regard to acuity and speed of 
response are significant chiefly at low illumina- 
tion levels and tend to disappear at levels 
above 0.1 foot-lambert, according to Luckiesh 
(1, pp. 242-243). However, the reading of
instrument dials under night flying conditions 
may fall well within the conditions for which 
the advantages of monochromatic light have 
been demonstrated. 

Many studies have shown that illumination 
which is limited to the long wave-length end of 
the spectrum permits the human eye to dark 
adapt more rapidly and more completely (as 
well as to maintain dark adaptation better) 
than when the illumination contains wave-lengths
from the middle or the short end of the
spectrum. This fact has been put to a number
of valuable military and civilian uses during
the past ten years.

1 The experiment reported here was conducted as part
of a program of research on human factors related to
aircraft instrument lighting carried out on a research
contract (W33-038 ac18317) between the University of
Rochester and the Air Materiel Command, U. S. Air
Forces, and has been reported in mimeographed form
to the Aero Medical Laboratory of the Air Materiel
Command (2).

2 M. L. R. is now with E. N. Hay Associates, Philadelphia.

Red lighting, however, has had some trouble- 
some aspects. One is the fact that red filters 
absorb a large percentage of the total visible 
energy flux emitted by the lamp source thus 
necessitating the use of relatively powerful 
sources in order to achieve a satisfactory 
brightness level. 

Evaluations of visual performance under 
different colored lights have at times been 
complicated by the factor of unequal bright- 
nesses of the colors employed. Some of the 
complaints about red lighting systems have 
been undoubtedly due, at least in part, to the 
fact that the sheer amount of light available 
is apt to be less than with other lights. 

In order to get as much light as possible
through filters, use has been made of sharp
“cut-off” rather than narrow-band filters.
The former, as the name indicates, have a
sharp cut-off, transmission falling
rapidly to zero below this point and, in the
other direction, flattening out at 85-90%
transmission for all wave lengths longer than
this region. Narrow-band filters, although
much more nearly monochromatic, commonly
have a maximum transmission of not more
than 10-20% over a narrow band of wave
lengths, falling quickly to zero on either side
of this region. Thus they transmit a very
small fraction of the total light flux incident
upon them.

In consideration of the above factors and in
view of the finding of a critical level of brightness
for dial reading performance in the preceding
study, it was decided for the present dial
reading experiment to set up the following
primary specifications: (a) performance at
illumination levels above and below 0.02
foot-lamberts should be studied; (b) sharp
“cut-off” filters should be used; and (c)
attention should be directed toward the red-orange-yellow
region of the spectrum, since a
compromise between sufficient light flux and
sufficient dark adaptation will very probably
be found in this region.


Method 


Apparatus. The apparatus used has been
described in the preceding study (3). For the
present experiment light sources were two
25-w. Mazda lamps in cans fitted with filter and
aperture holders. An assembly consisted of a
ground-glass square of heat-resistant glass, a
colored glass filter, and a brass plate with a
circular aperture drilled in its center. Voltage
control was maintained as in the previous
experiment.

Four pairs of Corning sharp “cut-off” filters
were used in this experiment. The characteristics
of each are summarized in Table 1. They
were chosen to give a good representation of the
red-orange-yellow region, plus a comparison
with the middle region (yellow-green).

Materials, The same stimulus cards were 
used that were employed in Experiment I of 
the preceding study and the same method of 
achieving permutations of the dials was used. 
A total of 40 combinations of dial cards (12 
dials to a card) was employed. A given 
combination of 12 dials was presented only
once to each subject during the present
experiment. (The same combinations had
also appeared once during the preceding
experiment.)

Subjects. The same twenty subjects who
served in Experiment I of the preceding study
were used in the present experiment. A month
or more elapsed between their participation
in the two experiments. Subjects were paid
for their services.


Table 1

Specifications of the Four Corning Sharp “Cut-off”
Color Filters Used. Values in mµ

              Wavelength (λ) Beyond      Wavelength (λ) Below
              Which Transmission Is      Which Transmission
Filter        80% or More to 750 mµ      Is 0.5% or Less           Color Name
Number
3-72                  541                       416                Yellow-green
3-67                  558                       524                Yellow-orange
2-60                  658                       599                Orange-red
[illegible]           678                       617                Deep red

Procedures. Instructions requesting both 
speed and accuracy and other general proce- 
dures were the same as in the preceding 
study. After preliminary practice subjects 
read 5 cards for each of the four colors at 
each of two brightness levels. Since the 
first and last dials read on each card were 
again omitted from the results, the data 
consist of 50 dials read by each subject for 
each color at each brightness level, or a total 
of 400 dials read by each of twenty subjects in 
this experiment. 
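
The counts in this paragraph follow directly from the design. As a quick check, the sketch below recomputes them; the constant names are illustrative and not taken from the report.

```python
CARDS_PER_CONDITION = 5      # cards read for each color at each brightness level
DIALS_PER_CARD = 12
OMITTED_PER_CARD = 2         # first and last dial on each card are not scored
COLORS = 4
BRIGHTNESS_LEVELS = 2

# Scored dials for one color at one brightness level.
scored_per_condition = CARDS_PER_CONDITION * (DIALS_PER_CARD - OMITTED_PER_CARD)

# Scored dials and cards for one subject across all eight conditions.
scored_per_subject = scored_per_condition * COLORS * BRIGHTNESS_LEVELS
cards_per_subject = CARDS_PER_CONDITION * COLORS * BRIGHTNESS_LEVELS

print(scored_per_condition, scored_per_subject, cards_per_subject)
```

The 40 cards per subject agree with the 40 card combinations described under Materials, each presented once.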

The illumination levels chosen were 0.01 and 
0.1 foot-lamberts. These values fall on either 
side of the critical 0.02 foot-lambert value 
determined in the previous study. They were 
chosen, rather than repeating the 0.018 and 
0.022 values used before, because of the 
difficulties and larger consequent errors in 
making hetero-chromatic brightness determin- 
ations with the Macbeth Illuminometer. In 
determining voltage values to be employed so 
that each color could be presented at 0.01 and 
0.1 foot-lamberts each of the writers made a 
number of observations with the Macbeth 
Illuminometer. The amount of disagreement, 
even for the lower brightness, did not exceed 
15 per cent; thus we can be reasonably confident
that in all cases our “low” readings were
below and our “high” readings above the 
critical 0.02 foot-lambert level determined 
previously for this dial reading task. 

Each subject reported for two sessions, 
several days apart. At the first session each 
subject read dials at both high and low 
brightness for two colors. At the next session 
he did the same for the remaining colors. 
Brightness sequences were varied in an ABBA 
and BAAB order, and the sequences of cards 
were systematically counterbalanced so that 
practice and fatigue effects would be equalized. 


Results 


The data of this experiment consist of
error scores and time scores for 20 subjects
reading dials under two brightness levels for
each of four colors of illumination. These
data are summarized in Table 2. Error scores
indicate the mean number of readings in error
(without regard to magnitude of error) in
reading 50 dials under each of the eight
illumination conditions. Time scores indicate
the mean reading time per dial in seconds under
each illumination condition.3 These data are
also presented graphically in Figure 1.


Table 2

Mean Performance in Reading 50 Dials under Each of Four Colors of Illumination
and at Each of Two Brightness Levels (N = 20)

                               0.1 Foot-lamberts                     0.01 Foot-lamberts
                        Yellow-  Yellow-  Orange-  Deep       Yellow-  Yellow-  Orange-  Deep
                        Green    Orange   Red      Red        Green    Orange   Red      Red
Mean number of errors   12.85    13.3     13.0     11.75      20.1     20.9     20.7     23.5
Standard deviation       2.74     2.78     1.90     2.05       4.26     5.91     5.72     7.75
Mean reading time
per dial, in seconds     1.37     1.36     1.30     1.34       2.11     2.16     2.10     2.25
Standard deviation       0.52     0.57     0.46     0.56       1.14     1.37     1.18     [illegible]

Analysis by t test showed that all the
comparable differences between the two bright- 
ness levels are highly statistically significant. 
Within each brightness level a series of t
tests was carried out between the means for 
the combinations of color pairs. These are 
summarized for error means in Table 3, and 
for time means in Table 4. 

From the t tests in Table 3 it is seen that at
the 0.1 foot-lambert level performance with 
the deep red filter is significantly more accurate 
than with the O-R or the Y-O filters and some- 
what but not significantly better than with the 
Y-G filter. At the 0.01 foot-lambert level, 
however, performance with the deep red filter 
is significantly less accurate than with the O-R 
and the Y-G filters; its inferiority to the Y-O 
filter approaches but does not reach the 5 per 
cent level. 

From the t tests for time scores in Table 4
it is seen that at the 0.01 foot-lambert level
performance with the deep red filter is significantly
poorer than with the O-R or the Y-G
filters but not significantly poorer than with
the Y-O filter. This finding is in essential
agreement with the error data reported above
for the 0.01 foot-lambert level.

3 Detailed results for errors and times have been
presented in the mimeographed report of this experiment
(2) submitted to the Air Materiel Command.

Fig. 1. Time and error scores for dial reading under
four colors of illumination (Y-G, Y-O, O-R, R). The white bars represent
performance at 0.1 foot-lamberts, the shaded bars at
0.01 foot-lamberts.


Table 3

Values of t, Comparing Mean Number of Readings in
Error for Four Colors, at Each Brightness Level

0.1 Foot-lamberts
          Y-G           Y-O           O-R           R
Y-O       0.57          —
O-R       0.22          0.44          —
R         1.76          3.04**        2.14*         —

0.01 Foot-lamberts
          Y-G           Y-O           O-R           R
Y-O       [illegible]   —
O-R       [illegible]   [illegible]   —
R         2.36*         1.76          2.36*         —

* Significant at 5% level.
** Significant at 1% level.


Table 4

Values of t, Comparing Mean Times to Read Dials
under Four Colors, at Each Brightness Level

0.1 Foot-lamberts
          Y-G           Y-O           O-R           R
Y-O       0.26          —
O-R       2.91**        1.96          —
R         1.40          0.85          1.18          —

0.01 Foot-lamberts
          Y-G           Y-O           O-R           R
Y-O       1.02          —
O-R       0.25          0.84          —
R         2.11*         1.37          3.35**        —

* Significant at 5% level.
** Significant at 1% level.


A comparison of dial reading performance
under colored and under white illumination
(Mazda light at approximately 2,400°K.) has
been attempted in Table 5. The values for
colored light were arrived at by averaging the
respective values for the four colors from Table
2. The values for white illumination are from
the previous study (3) and were arrived at by
interpolating values from Table 1 of that study
so as to provide estimates of performance at
0.1 and 0.01 foot-lamberts.


Table 5

Comparing Average Dial Reading Performance under
Colored and White (Mazda) Illumination

Illumination,            Mean Errors,                  Mean Time per
in Foot-lamberts         as Per Cent      t            Dial, Seconds     t
Colored: 0.1             25.4                          1.34
White:   0.1*            29.4             2.08         1.47              2.4

Colored: 0.01            42.6                          2.15
White:   0.01*           64.5             [illegible]  [illegible]       [illegible]

* Values obtained from interpolation of values in (3, Table 1).


A series of t tests showed that performance
under colored illumination was significantly
superior to performance under white illumination.
The differences at 0.1 foot-lamberts for
errors and time are significant beyond the
.06 and .03 levels respectively. At 0.01 foot-lamberts
both differences are significant beyond
the .001 level.

These results show dial reading performance
to be superior under colored illumination, with
the superiority more marked at a very low
photopic brightness level (0.01 foot-lamberts)
than at a somewhat brighter level (0.1 foot-lamberts).
This finding is consistent with the
comment made at the beginning of this paper
with respect to the advantages of monochromatic
(or spectrally restricted) light,
especially for small objects at low levels of
illumination (1, pp. 242-243).
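
The colored-light error percentages in Table 5 can be recovered from Table 2 by averaging over the four colors and expressing the result as a percentage of the 50 dials read. The sketch below assumes the Table 2 error means read as 12.85, 13.3, 13.0, 11.75 (0.1 foot-lamberts) and 20.1, 20.9, 20.7, 23.5 (0.01 foot-lamberts); the function name is illustrative.

```python
DIALS_READ = 50

# Mean number of errors per color (Y-G, Y-O, O-R, deep red) at each brightness level.
mean_errors = {
    0.1: [12.85, 13.3, 13.0, 11.75],
    0.01: [20.1, 20.9, 20.7, 23.5],
}


def mean_error_percent(values: list[float]) -> float:
    """Average the per-color error means and express them as per cent of dials read."""
    return 100 * (sum(values) / len(values)) / DIALS_READ


for level, values in mean_errors.items():
    print(f"{level} foot-lamberts: {mean_error_percent(values):.2f} per cent")
```

The results, about 25.45 and 42.6 per cent, agree with the 25.4 and 42.6 entered for colored light in Table 5, which supports this reading of the table.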

A possible defect in the above comparison 
lies in the fact that all the colored light dial 
readings took place at least a month after the 
white light tests, hence our results may reflect 
merely a practice effect. However, analysis of 
the data of the previous experiment (3) 
revealed no evidence of a practice effect at the 
very stages of performance where practice 
effects, if significant, would be expected to be 
at a maximum. 

Certainly the conservative conclusion can 
be vigorously defended, namely that the use of 
colored instrument illumination (in order, for 
example, to preserve dark adaptation) will not 
impair dial reading performance as long as 
intensities are comparable. 


Discussion 


The data reported above for time and for 
errors show that at low photopic brightness 
(i.e., below the 0.02 foot-lambert level which
the previous study (3) showed to be critical) 
dial reading performance is poorer under deep 
red light than under the other three colors of 
illumination tested. At a brightness above 
this critical level the results were ambiguous 
with no clear relationship between color of
illumination and performance. 




The most important conclusion to be drawn 
from the results of this study is that in situa- 
tions where red light is used for dial reading and 
similar tasks brightness should be maintained 
above a critical value (about 0.02 foot-lamberts) 
in order that speed and accuracy of visual 
performance be not adversely affected. If 
brightness is allowed to fall below this critical 
value the decrement in performance will be 
most severe for red light. A brightness level 
of at least 0.1 foot-lamberts would seem to be 
a safe recommendation. 

Since with adequate intensity of illumination 
there was no consistent relation between 
performance and wave-length composition of 
the illumination it would appear that the 
lighting of instruments with red light, which 
is best for achieving and maintaining dark 
adaptation, would involve no sacrifice of 
instrument readability. 


Summary 


An experiment is reported on the speed and 
accuracy with which subjects can read photo- 
graphic reproductions of instrument dials as a 
function of the color of illumination provided. 

Young adult male subjects with excellent 
visual abilities read dials at two brightness 
levels, 0.1 and 0.01 foot-lamberts, under four 


S. D. S. Spragg and M. L. Rock 


colors of illumination: yellow-green, yellow- 
orange, orange-red, and deep red. The results 
showed that at 0.01 foot-lamberts performance 
was poorest under deep red light, while at 0.1 
foot-lamberts there was no clear trend in the 
results. The effects of color differences were 
minor compared to those due to brightness 
differences. 

These findings indicate the importance of 
keeping instrument brightness above the 
critical level (asserted to be about 0.02 foot- 
lamberts) especially when red lighting is used. 
They also indicate that if this critical bright- 
ness level is exceeded instrument readability 
is not decreased by the use of red light,
which is known to be best for maintenance of
dark adaptation. 


Received April 1, 1952.
Early publication.


References


1. Luckiesh, M. Light, vision and seeing. New York:
D. Van Nostrand, 1944.

2. Spragg, S. D. S., and Rock, M. L. Dial reading
performance as related to illumination variables.
II. Spectral distribution. USAF Memorandum
Report No. MCREXD-694-21A, Air Materiel
Command, December, 1948.

3. Spragg, S. D. S., and Rock, M. L. Dial reading
performance as a function of brightness. J.
appl. Psychol., 1952, 36, 128-137.


Dimensional Analysis of Motion: II. Travel-Distance Effects 1 


Robert Wehrkamp and Karl U. Smith 


University of Wisconsin 


New techniques of analysis of human 
manual motions (3) which permit the separate 
measurement of the travel and manipulative 
components of movements, have provided the 
means for extensive dimensional study of 
psychomotor activities used in work. A prior 
study (2) has dealt with the characteristics of 
travel movements and manipulative move- 
ments in relation to laterality and direction of 
motion. In this experiment, additional dimen- 
sional factors affecting movements have been 
investigated, especially type of manipulation, 
distance of travel of the motion, and effects of 
practice on the different fundamental com- 
ponents of the movement pattern. 


Experimental Methods 


It is the specific purpose of this experiment to
determine to what extent the separate basic
components of human manual motions, i.e.,
the travel and manipulative aspects of these
motions, vary as a function of the pattern of
manipulation and the extent of travel of the
manual movement. Two patterns of manipulation
have been compared, switch-turning
and pin-lifting motions. The time characteristics
of these two types of repetitive manipulations
have been measured in relation to variation
in distance of travel between successive
manipulations. At the same time, practice in
these movements has been carried out over a
period of three days and the effects of learning
on the travel and manipulative components of
the motions determined.

The apparatus which permits analytical
measurement of the different components of
human motion and also provides for quantitative
determination of the effects of various
spatial, physical, and psychological dimensions
of human manual movements has been named
the “Universal Motion Analyzer.” A diagrammatic
sketch of the different elements of
the apparatus, as used to accomplish the
purposes of this experiment, is shown in
Figure 1.

1 Supported in part by the funds received from the
Graduate Research Committee, The University of
Wisconsin.


The Universal Motion Analyzer is composed 
of two main components, the timing circuits 
and the planned work situation. The funda- 
mental principle of the timing mechanism is 
that the subject is included in the electronic 
relay circuits in such a way that his contact 
and release of successive work objects or con- 
trols starts and stops different clocks which 
measure separately the time of contact with 
the object or control and time required to 
travel from one object or control to another. 
By means of this arrangement, travel motions 
and manipulative motions are measured sepa- 
rately in hundredths of a second. 
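
The timing principle can be illustrated in software. The sketch below is not the relay circuit itself: it assumes a hypothetical log of alternating contact and release events with timestamps in seconds, and it accumulates contact-to-release intervals as manipulation time and release-to-contact intervals as travel time, rounded to hundredths of a second.

```python
from dataclasses import dataclass


@dataclass
class Event:
    time: float  # seconds since the start of the trial
    kind: str    # "contact" or "release"


def split_times(events: list[Event]) -> tuple[float, float]:
    """Return (manipulation_time, travel_time) in seconds for one trial.

    Time between a contact and the following release counts as manipulation;
    time between a release and the following contact counts as travel.
    """
    manipulation = 0.0
    travel = 0.0
    previous = None
    for event in events:
        if previous is not None:
            if previous.kind == "contact" and event.kind == "release":
                manipulation += event.time - previous.time
            elif previous.kind == "release" and event.kind == "contact":
                travel += event.time - previous.time
            else:
                raise ValueError("events must alternate between contact and release")
        previous = event
    return round(manipulation, 2), round(travel, 2)


# Illustrative trial with three manipulations and two travel movements.
trial = [
    Event(0.00, "contact"), Event(0.30, "release"),
    Event(0.75, "contact"), Event(1.10, "release"),
    Event(1.50, "contact"), Event(1.80, "release"),
]
print(split_times(trial))
```

In this made-up trial the two clocks would read 0.95 seconds of manipulation and 0.85 seconds of travel.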

The work situation used here consists of a 
large control panel 122.4 cm. long and 86.8 cm. 
high at the center. This panel contains 34 
turn-type switches and 34 pins arranged in the 
regular manner shown in the diagram. The
task of the subject is to turn successive switches 
to the right (3/4 turn) or lift the pins away
from the panel for a distance of 1.5 cm. Both 
switches and pins are arranged to return to 
their original position after the manipulative 
motion is completed. 

In order to control and vary the distance of 
travel between successive manipulations, the 
arrangement of switches diagrammed in Fig- 
ure 2 was used. Three conditions of extent 
of travel were studied in the experiment. In 
the shortest travel distance, successive manipu- 
lations of each turn-switch were made in an 
up to down direction within the four center 
rows of switches of the panel. This condition 
will be referred to hereafter as Condition S, and 
involved 20 successive manipulations. The 
travel distance between each manipulation in 
this condition is 15.3 cm. A medium distance 
of travel of 30.6 cm. between successive switches 
was also used. This distance of travel will be 
called Condition M. The longest distance of 
travel used, to be named Condition L, involved 
a distance of travel of 61.2 cm. between suc- 
cessive switches. Examination of the diagram 
will show that each of the three variations in 
distance of travel involved three long return 
motions between rows of switches of approxi- 
mately 63 cm. It should be noted also, that 
in making the longer travel movements, the
subject had to skip certain switches which were
not hidden on the board. This factor had an
influence on the results of the pin-lifting motions
which will be noted later.

[Figure 1: diagram with parts labeled Electronic Relays, Pin Control, Turn Control, Universal Control Panel, and Terminal Panel.]

Fig. 1. The separate parts of the Universal Motion Analyzer. The clocks used are Model S-1, Standard
Electric Time Clocks. The turn switches on the control panel are 12.0 cm. in diameter and protrude [illegible] from
the face of the panel. The pins are 3.2 cm. long and 0.5 cm. in diameter. These pins lie flat against the face
of the panel.

[Figure 2: A. Condition "S": 20 Manipulations. B. Condition "M": 12 Manipulations. C. Condition "L": 8 Manipulations.]

Fig. 2. The three conditions of travel movement used in the experiment.

In making observations on pin-lifting move- 
ments, the same arrangements used in studying 
the switch-turning movements were used. 
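
The three travel conditions can be compared with a short calculation. The sketch below rests on an assumption that is not stated in the text: that each of the four rows spans 61.2 cm. from first to last switch, which is what 20 manipulations spaced 15.3 cm. apart in four rows would imply. The names are illustrative.

```python
ROWS = 4
ROW_SPAN_CM = 61.2  # assumed distance from first to last switch within a row

# Travel distance between successive switches in each condition, in cm.
STEP_CM = {"S": 15.3, "M": 30.6, "L": 61.2}


def moves_per_row(step_cm: float) -> int:
    """Number of travel movements needed to cover one row at the given spacing."""
    return round(ROW_SPAN_CM / step_cm)


def manipulations(step_cm: float) -> int:
    """Switches operated per trial: one more than the moves in each row, times rows."""
    return (moves_per_row(step_cm) + 1) * ROWS


def within_row_travel_cm(step_cm: float) -> float:
    """Total distance travelled within rows, excluding the return motions between rows."""
    return moves_per_row(step_cm) * step_cm * ROWS


for name, step in STEP_CM.items():
    print(name, manipulations(step), round(within_row_travel_cm(step), 1))
```

Under this assumption the conditions involve 20, 12, and 8 manipulations respectively, while the total within-row travel distance is the same (244.8 cm.) in all three, so the conditions differ in how the travel is divided rather than in its total extent.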


The experimental plan is summarized in
Table 1. The 42 college students who acted
as subjects were divided into two main
subgroups. One of these subgroups operated
the turn-switches, the other the pins. Each
of the main subgroups was further divided into
three task groups. Different task groups
performed the different distances of travel for
the two types of manipulation. Each subject
carried out his respective three trials each day
over a period of 3 days. Travel time, manipulation
time, and total time required in the
performance of each trial were recorded in
hundredths of a second. Before beginning the
experiment, the subject was instructed carefully
in the performance of his specific task.
In addition, a standardized seating procedure
was followed to insure that each subject would
be placed at the same relative distance from
the work panel with respect to his arm length.


Table 1

The Experimental Plan

Type of          Distance                     Day 1      Day 2      Day 3
Manipulation     of Travel    Subject No.     Trial      Trial      Trial
Turning          15.3 cm.     [illegible]     1 2 3      4 5 6      7 8 9
movements        30.6 cm.     [illegible]     1 2 3      4 5 6      7 8 9
                 61.2 cm.     [illegible]     1 2 3      4 5 6      7 8 9
Pulling          15.3 cm.     [illegible]     1 2 3      4 5 6      7 8 9
movements        30.6 cm.     [illegible]     1 2 3      4 5 6      7 8 9
                 61.2 cm.     [illegible]     1 2 3      4 5 6      7 8 9


Results 


The raw data for this experiment are the 
separate total travel-time scores and the 
manipulation-time scores for carrying out a 
given unit operation or trial for a given condi- 
tion of travel and type of manipulation. For 
presentation of the main results, the manipula- 
tion-time scores were divided through by the 
number of manipulations involved in a given 
condition in order to obtain a mean manipula- 
tion-time score. Travel time is generally 
treated in terms of the total time per trial. 

Separate analyses of variance were carried 
out for the travel-time data and the manipula- 
tion-time data. These analyses pointed up 
the following facts. All of the main effects 
studied in the experiment, i.e., type of manip- 
ulation, distance of travel and variations due 
to learning, were found to be significant at the 
5% level or beyond for both travel-time data 
and manipulation-time data. Subject varia- 
tion is, of course, significant for both aspects 
of movement. None of the interactions, 
except that between type of manipulation and 
distance of travel for travel-time scores, were 
found to show significant variation in the study. 
Special note will be made later of the lack of 
significant interactions between conditions of 
movement and the effects of learning. 

Attention may be drawn to some of the main 
differences found in the experiment. In all 
of this description, it is desired to know how 
the separate travel-time and manipulation- 
time scores are affected by the variations in 
type of manipulation, distance of travel, and 
by practice. 

A first main result of this study is that 
pattern of manipulation of simple grasping and 

2 The raw data of this experiment and the summaries 
of analyses of variance have been filed with the Ameri- 
can Documentation Institute. Order Document 3419 
from American Documentation Institute, 1719 N Street, 
N.W., Washington 6, D. C., remitting $1.00 for micro- 
film (images 1 inch high on standard 35 mm. motion 
picture film) or $1.35 for photocopies (6 X 8 inches) 
readable without optical aid. 




turning differs significantly from the movement 
of grasping and pulling as far as the time of 
manipulation is concerned. Pulling move- 
ments also produce a significantly longer travel 
time in successive motions. The manipulation 
time for the pulling motions is about 53 per cent 
greater than that for the turning motion. 
Travel time for pulling motions is about 32 per 
cent greater than that for turning motions. 

Varying the distance of travel between 
successive motions influences both the travel 
time and manipulation time of the motion. 
Both of these components of movement are 
increased in duration as the travel distance is 
increased. Increasing the travel distance of 
successive motions four times, from approx- 
imately 15 cm. to 60 cm., produces an increase 
in duration of each manipulation by 30%. 
The increase in travel time per movement, due 
to a similar increase in travel distance, 
amounts to 78 per cent. 


Fig. 3. The change in manipulation time for each manipulation as a function of 
increasing the travel-distance between manipulations. (Vertical axis: time in 
1/100 seconds; horizontal axis: condition.) 

Robert Wehrkamp and Karl U. Smith 


Figure 3 illustrates graphically the unex- 
pected increase in manipulation time produced 
in both turn- and pull-motions by increasing 
the distance of travel between successive 
motions. The relatively large values observed 
in the case of the pulling motions, for Condition 
M, that is, a travel distance of 30.6 cm., may 
be the result in part of a search factor. When 
operating the pins of the control panel under 
this condition, subjects were sometimes ob- 
served to make errors in the pin selected. 

As already noted, the effects of practice on 
the basic components of travel and manipula- 
tion in work motions have been measured in 
this experiment. These two components of 
motion, as illustrated in Figure 4, are affected 
differently by practice and learning. Travel 
motions show relatively very little change as a 
result of practice. Manipulative components 
of motion show a 32 per cent decrease in time 
as a result of practice, as compared to a 
12 per cent decrease for the travel aspect of the 
motions. These figures are for days of practice. 
Between the first and last trials, manipulative 
motions decreased in time by 55 per cent, and 
the travel component of the motion pattern 
decreased by 22 per cent. 

Fig. 4. Change in manipulation time and travel time per movement 
as a function of practice over three days. (Vertical axis: time in 
1/100 seconds; curves: travel time and manipulation time.) 

It has been mentioned already that no 
significant interactions were found between 
practice effects and other conditions of the 
experiment, either pattern of manipulation 
or distance of travel. This statement applies 
for both the travel and the manipulative 
components of the movement pattern. Ac- 
cordingly, the results of this study are that the 
effects of practice do not alter the fundamental 
differences in the characteristics of motion as 
these characteristics are defined by the other 
conditions of movement brought under investi- 
gation in the study. 


Summary and Discussion 


The utilization of new techniques for 
dimensional analysis of human work motions 
has been described. These methods, which 
employ improved principles of both measure- 
ment of components of motion and pre-plan- 
ning of the working task to achieve controlled 
experimental design, have been applied to a 
study of the effects of type of manipulation, 
distance of travel, and practice on the manip- 
ulative and travel components of motion 
patterns. 

It has been found that in the performance of 
the actual manipulation, simple grasp and 
turn movements of the hand are performed 
53 per cent faster than simple grasp and pull 
movements. Furthermore, during enon 
motion, the travel time for the turn movements 
is about 32 per cent faster than that for the 
pull movements. The pattern of a manipula- 




tion defines not only the time of muscular 
response in the manipulation but also the time 
of muscular response in travel between suc- 
cessive motions. 
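The 53 and 32 per cent figures above are differences in time taken with the turning motion as the base. The short sketch below, which assumes nothing beyond those two reported percentages (the function name is illustrative only), shows what the same differences amount to when the pulling motion is taken as the base instead.

```python
def reduction_from_increase(percent_greater):
    """If time B is `percent_greater` per cent above time A,
    return by what per cent time A falls below time B."""
    ratio = 1.0 + percent_greater / 100.0   # B / A
    return (1.0 - 1.0 / ratio) * 100.0      # (B - A) / B, in per cent

# Pull manipulation time is reported as 53 per cent greater than turn time,
# and pull travel time as 32 per cent greater than turn travel time.
manipulation = reduction_from_increase(53)
travel = reduction_from_increase(32)
print(round(manipulation, 1), round(travel, 1))
```

On this reading, a time that is 53 per cent greater for pulling corresponds to a turning time roughly one-third shorter than the pulling time, so the base of each percentage should be kept in mind when the two statements are compared.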

Extending the distance of travel between 
successive motions produces an increase in 
time of both the travel component of the 
motion and that of the manipulation involved. 
Thus, it has been shown that pattern of 
manipulation and distance of travel have 
integrative effects related to all of the com- 
ponents of the working task and do not affect 
alone the component of movement primarily 
related to these dimensional variations. 

Barnes (1) has reported that extending the 
distance of travel of a motion does not affect 
the efficiency or time of a motion. This 
observation is contradicted by the present 
results, which show that both the manipulative 
and travel aspects of motion may be increased 
in duration by as much as fifty to seventy-five 
per cent by a fourfold increase in the distance 
of the travel component of the motion. 

The role of learning in defining the organiza- 
tion and efficiency of different basic com- 
ponents of movements in the working task 
has not been investigated prior to this study. 
The present results show that travel aspects 
of motion are relatively less affected by learning 
than the manipulative reactions in a working 
movement. In fact, the magnitude of change 
in the travel component of motion during 
practice is only about one-third as great as 
that found for the manipulative component of 
the same motor pattern. This result poses a 
real problem for understanding the role of 




learning in psychomotor skill inasmuch as it 
is the travel elements of a complex movement 
pattern which define the over-all organization 
of the pattern. 

Another noteworthy point about the role of 
learning in psychomotor skill has been brought 
out by the present study. Learning effects 
do not interact significantly with the effects of 
the pattern of manipulation and with distance 
of travel of a motion. This fact seems to 
suggest that learning and practice do not 
contribute significantly to motor organization 
in simple work patterns of movement of the 
sort investigated here. 

The present study has pointed up significant 
differences of varying magnitudes in different 
aspects of work motions as a function of 
three main conditions of work. The practical 
significance of these differences is established 
through the fact that the manifold repetition 
of the single patterns of movement to accom- 
plish a daily task represents an accumulation of 
time of great importance both economically 
and psychologically in industry. 


Received July 11, 1951. 


References 


1. Barnes, R. M. Motion and time study (3rd edition). 
John Wiley and Sons, 1949. 

2. Davis, R. T., Wehrkamp, R. F., and Smith, K. U. 
Dimensional analysis of motion: I. Effects of 
laterality and movement direction. J. appl. 
Psychol., 1951, 35, 363-366. 

3. Smith, K. U., and Wehrkamp, R. F. A universal 
motion analyzer applied to psychomotor per- 
formance. Science, 1951, 113, 242-244. 


A Multiple Factor Analysis of Advertising Readership 


Dik Warren Twedt 


Northwestern University 


One of the most widely used indices of the 
attention value of published advertisements is 
the extent to which people read and remember, 
as determined by readership recognition sur- 
veys of the Gallup, Starch, and Advertising 
Research Foundation type.¹ In these surveys, 
a representative sample of a publication’s 
circulation (usually from 200 to 400 subjects) 
is interviewed shortly after publication of the 
survey issue. Working with a whole copy of 
the issue (or an abbreviated issue if the original 
is so large as to cause fatigue during the inter- 
view), the interviewer goes through the issue 
page-by-page, recording the elements which the 
respondent says he has read. The resulting 
readership scores are simply percentages of 
readers who report having read a particular 
article or advertisement. 

The present analysis is based primarily upon 
the Advertising Research Foundation’s Con- 
tinuing Studies of business magazines (2, 3, 4, 5) 
for these reasons: 


1. They are the most recent studies pub- 
lished by ARF, and have advantages of certain 
technical refinements such as Lucas’ confusion- 
control (11), the elimination of respondents 
who identify more than a critical number of 
advertisements or articles which have never 


been published. 

2. Business papers which are members of the 
Audit Bureau of Circulations (as are all of the 
publications represented in the business mag- 
azine studies) are required to classify their 
subscribers by occupation and geographical 
area (see paragraph 10 of the biannual pub- 
lisher’s statement, available from the Audit 
Bureau of Circulations or from the publisher). 

1 Basic data for the present analysis are taken from 
the Advertising Research Foundation’s Continuing 
Studies of Readership. The ARF, a non-profit organi- 
zation sponsored jointly by the American Association 
of Advertising Agencies and the Association of National 
Advertisers, has as its purpose [illegible] 
effectiveness in [illegible] through impartial research. 
The Foundation makes newspaper, trans- 
portation advertising, business magazine, and 
management publication readership studies. Since its 
inception in 1936, the Foundation has published 
surveys of nine media in 146 markets throughout the 
United States and Canada (7). 

3. It is reasonable to assume that the popula- 
tion of readers of a business magazine such as 
American Builder is more homogeneous with 
respect to interest in business advertising, 
than are readers of general media with respect 
to consumer advertising. 

4. In measuring readership of consumer 
advertising in general magazines, it is difficult 
to partial out the cumulative impression of 
advertising in other media such as radio, 
television, billboards, etc. This problem is 
also present in business paper advertising, but 
to a considerably lesser degree. 


Purpose of the Analysis 
This analysis has a threefold purpose: 


1. To define and measure certain variables 
in business magazine advertising, and deter- 
mine the interrelations among these variables, 
and their relation to readership as measured by 
the ARF recognition surveys. 

2. To determine the factorial structure of the 
relationships among these variables, so as to 
make possible a simpler psychological explana- 
tion of the obtained variance in readership 
scores of advertisements. 

3. To develop a multiple-regression equation 
which will predict advertising readership in 
business papers. 

This analysis is thus one of audience (or what 
people do to the advertisements) rather than 
one of effect (what the advertisements do to 
people). The same general experimental and 
statistical approach is also applicable to 
studies of advertising effect, the only stipula- 
tion being that an adequate effect criterion 
must first be available. 

The experimental design employed in this 
study is intended to uncover general principles 
of advertising which will increase the probability 




that prospects will be exposed to a given sales 
message. Because of the complex nature of 
the problem—the many variables which may 
influence readership both directly and through 
interaction with other confounding variables— 
and particularly because of the expense and 
difficulty of controlled, single-variable exper- 
imentation in a practical advertising situation, 
it is not easy to evaluate the relative import- 
ance of variables contributing to variance in 
readership scores. Comparison of high-scoring 
advertisements with low-scoring advertise- 
ments is helpful, but this does not represent 
the most powerful statistical technique avail- 
able. And we do need statistical controls; 
even where large numbers of observations are 
available, categorizing the data by such 
pertinent variables as size and color may 
reduce the number of cases so greatly that 
conclusions based upon them are not likely 
to be stable. 

Fortunately there is an exploratory method 
(multiple factor analysis) which is well suited 
to the Continuing Studies of readership data. 
The basic assumption of the factorial method is 
that there is an underlying order which, when 
found, will permit us to give a simpler explana- 
tion of phenomena which may seem to be the 
result of a very large number of variables. 
The method begins with a table of inter- 
correlations, or correlation matrix (see Table 
2). From this matrix we attempt to get 
simplified explanations or factors for the 
observed correlations (see Table 3). 


Procedure 


A preliminary analysis was made of the ARF 
Continuing Study of Business Papers No. 2 
(3), on the American Builder, a monthly 
trade magazine edited primarily for building 
contractors and dealers. At the time of the 
February, 1950 issue, its circulation was 
approximately 80,000 (13). The American 
Builder was chosen for this analysis principally 
because of willing cooperation from the 
magazine’s publisher and research manager. 
This magazine averages more than 300 pages 
to an issue. The survey issue of February 
1950 contained 320 pages, of which only 188 
were included in the restapled interviewing 
copies. The abbreviated survey issue con- 
tained 137 advertisements of varying sizes, 
ranging from § page to + pages. In advertise- 
ments } page or larger (N = 122), the following 
readership percentages are available: (1) “Any 
This Ad,” per cent who remembered reading 
or seeing any part of the advertisement; (2) 
“Headline,” per cent who remembered reading 
the principal headline of the advertisement; 
(3) “Any Copy,” per cent who remembered 
reading any of the advertising copy, exclusive 
of the headlines; (4) “Pictures,” per cent who 
remembered seeing the picture indicated. 
For advertisements smaller than } page, only 
one readership percentage, “Any This Ad,” is 
given. For all advertisements } page or 
larger, “Any This Ad” readership percentages 
correlated .98 with “Pictures,” .91 with “Any 
Copy,” and .90 with readership of “Headlines.” 
The more inclusive category, “Any This Ad,” 
was chosen as the criterion measure of reader- 
ship. 

Against this criterion, product-moment r’s 
were computed for 34 advertising variables 
(see Table 1). Mechanical variables are listed 
as items 1 through 15 in Table 1, and content 
variables are listed as items 16 through 34. 
Detailed definitions of each variable have been 
deposited with ADI, from which microfilmed 
copies are available at nominal cost.² 

The correlations of .00 and .01 between 
readership and Flesch readability indices a , 
16) were not statistical artifacts due to restric- 
tion in range of Flesch scores, but they may be 
a function of high specialization of interest by 
technical audiences. 

Of the 34 variables which were correlated 
with the readership criterion, 19 variables 
were selected on the basis of significant 
correlation with the criterion. In Table 1, 
variables 6, 23, 25, 32, and 33 were not included 
in the correlation matrix because they were not 
independent of other variables which were 
included. Variable 28 was not included 
because of low reliability of judges. Means 
and standard deviations are included in Table 

2 Detailed definitions have been deposited with the 
American Documentation Institute. Order Document 
3417 from American Documentation Institute, 1719 N 
St., N.W., Washington 6, D. C., remitting $1.00 for 
microfilm (images 1 inch high on standard 35 mm. 
motion picture film) or $1.00 for photocopies (6 X 8 
inches) readable without optical aid. 


Table 1 


Correlations with Readership, Means, and Standard Deviations of 34 Advertising Variables 
in Continuing Studies of Business Papers No. 2, American Builder, issue of 
February, 1950, N = 137 unless otherwise indicated. 


Variable r M σ 
K. Criterion (Per Cent Readership)* = 26.7 16.6 
Mechanical Variables 
1. Number of pages or size of advertisement 627 6 6 
2. Width-height ratio of advertisement 5 6 3 
3. Number of colors 37} 2.3 3 
4, Number of separate illustrations -28t 2.7 39 
5. Square inches of illustration -67F 12.4 175 
6. Proportion of illustration Eyi 24.3 16.4 
7. Number of type styles -06 4.9 16 
8. Number of type sizes -28t 5.6 16 
9, Point size of largest type ADF 38.9 23.1 
10. Point size of headlines (wtd. avg.) 3} 20.7 121 
11. Largest type: product identification 39F 27.7 19.2 
12. Point size of main body copy 24} 7.9 18 
13. Pica width of copy measure (wtd. avg.) Sof 14.6 55 
14. Number of copy blocks -30t 6.7 40 
15. Layout deviation (+) from 90° —.13f 12.3 6.7 
Content Variables 
16. Flesch readability scores .00 46.9 10.7 
17. Flesch abstraction level scores 01 27.3 re 
18. Number of words in advertisement ait 169.3 162.0 
19. Number of words in headlines 10 11.3 13.8 
20. Number of product identifications 40t 6.8 53 
21. Number of product facts 19} 16.5 11.5 
22. Number of product benefits .29ł 11.6 8.9 
23. Number of pictorial benefits AS 1.6 2.2 
24, Number of benefits in headlines —.05 17 1.8 
25. Number of benefits in body copy 28 SiG 6.7 
26. Number of pictures of product in use 33t 2.3 1.7 
27. Directions for getting more details = 2 1 
28. News value ratings £ i y 
29, Readership of surround a a wig 
30. Number of similar ads in issue Tii 50 i 
31. Previous schedule: 1/50 + 1949 ‘45 9.2 a 
32. Previous schedule: 1/50 + 1949-48 45 131 177 
33. Previous schedule: 1/50 + 1949-48-47 ast ape Pose 
34, Brad-Vern totals : ` 


* oy A 
PS based on advertisements 


Significant r’s which are included in the 20 X 20 correlation matrix. 
Based on 57 full-page advertisements. 
Based on 34 advertisements. 


1 to provide the reader with some knowledge 
of the underlying distributions from which the 
correlations were obtained. The units in which 
the means and σ’s are expressed are described 
in the [illegible] of this paper. Product-moment 
intercorrelations were computed for these 
variables plus the readership criterion, and 
incorporated in a 20 X 20 correlation matrix 
(Table 2). In general, the variables are 
positively correlated. 

The correlation matrix was factor analyzed 
with Thurstone’s complete centroid method 


Table 2 
Correlation Matrix of Product-Moment r’s for 20 Advertising Variables * 

Variable                    2   3   4   5   8   9  10  11  12  13  14  18  20  21  22  26  29  31 
 K Readership 
 2 Size of Ad 
 3 Number of Colors       -07 
 4 Number Illustrations    35  24 
 5 Sq. In. Illustration    71  21  25 
 8 Number Type Sizes       54  21  22  27 
 9 Largest Type            64  32  21  45  54 
10 Headline Size           49  26  23  31  25  55 
11 Largest Prod. Ident.    41  28  04  33  41  66  26 
12 Body Type Size          19  18 -07  17  25  23  25  25 
13 Pica Width              25  28 -11  28  24  24  25  22  49 
14 Number Copy Blocks      57  20  54  23  40  29  22  14 -05 -01 
18 Number Words            62  11  24  24  43  32  23  24 -03  12  61 
20 Number Prod. Ident.     61  18  33  36 -06  37  24  30  05  16  62  62 
21 Number Prod. Facts      30 -02  25  26  16  19 -02  08 -03 -17  38  49  35 
22 Number Prod. Benefits   51  11  30  22  37  31  16  20 -08  00  59  68  62  54 
26 Pictures of Use         29  14  52  32  14  13 -02  08 -04 -13  45  16  29  35  37 
29 Surround                49 -11  12  38  20  44  45  15  24  22 -04  00  20 -38 -19 -17 
31 Previous Schedule       53  16  20  44  17  27  46  14  02  29  26  39  40  18  34  06 -18 
34 Brad-Vern Schedule      22  01  04  21  08  06  15  03 -04  04  09  19  17  02  22  10 -17  50 

* Decimals omitted. The column of correlations with the criterion (K) is not 
legible in this copy; correlations with Readership are given in Table 1. 


Table 3 
Factor Loadings and Communalities of 20 Advertising Variables * 

Centroid Loadings Rotated Loadings 

Variable I II III IV V VI h² PC S T In F 

K Readership 74 —18 —21 -18 0 —12 6773 64 35 28 18 16 
5 Sq. In. Illustration 68 =i —-10 —13 14 —30 6158 51 48 25 06 23 
26 Pictures of Use 38 37 11 —50 05  —10 5559 51 23 —18 09 10 
3 Number of Colors 3 —25 -17 31-15 13 3358 49 —07 23 11  —15 
1 Size of Ad 87 07 18 30 27 —10 9827 18 69 26 45 45 
21 Number Prod. Facts 35 48 15 =17] -2 —07 4533 20 37 —24 28 —12 
34 Brad-Vern Schedules 26 19 —39 19 02 —27 3652 21 37 —25 —01 07 
9 Largest Type 71 —36 29 18 00 —06 7538 15 45 62 36 12 
11 Largest Prod. Ident. 51 —33 16 08 —14 —09 4287 18 34 48 21 —06 
8 Number Type Sizes 55 868 46 11 —28 07 6419 01 37 47 47 —16 
12 Body Type Size 26 —47 —09 —04 -02 04 3002 12 04 46 09 os 

10 Headline Size J à 2 =W 18 06 04 4230 26 24 34 30 1 
18 Number Words 64 38 15 30 —18 17 7278 05 46 —10 71 Ma 
14 Number Copy Blocks 62 43 21 —08 —04 27 6943 25 27 —13 o o 
20 Number Prod. Ident. 66 37 —il 14 04 25 6683 31 29 —20 = whe 
22 Number Prod. Benefits 60 59 13 14 -2 04 6965 14 52 = a 05 
29 Surround 24 —54 18 20 75 16 1.0097 —03 —13 ' A i 
31 Previous Schedule 55 16 —53 25 Oe, “=19 7092 42 a m y me 
13 Pica Width 34 =a — 33 08 = —07 11 4415 35 —0 i 5 

4 Number Illust. 46 27 15 —34 18 07 4599 36 17 —05 27 25 


* Decimals omitted. Boldface indicates factor on which each variable has its highest loading. 




(15, Ch. VIII). The resulting centroid matrix 
is shown in Table 3. Extraction was stopped 
with the sixth factor, since the product of the 
two highest loadings in factor VI(.27 X .30 
= .08) is only equal to the standard error of 
the original r between these two variables. 
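The stopping rule just described can be checked with a few lines of arithmetic. The sketch below assumes the classical large-sample standard error of a product-moment r, (1 - r²)/√N, and assumes that the two loadings in question belong to variables 5 and 14, whose correlation in Table 2 is .23; neither assumption is stated in the text, and the function name is illustrative.

```python
import math

def standard_error_of_r(r, n):
    """Classical large-sample standard error of a product-moment r
    (an assumed formula; the text does not say which one was used)."""
    return (1.0 - r * r) / math.sqrt(n)

n_advertisements = 137
loading_a, loading_b = 0.27, 0.30   # two highest loadings on factor VI
r_original = 0.23                   # assumed pair: variables 5 and 14 (Table 2)

product = loading_a * loading_b
se = standard_error_of_r(r_original, n_advertisements)
print(round(product, 3), round(se, 3))
```

Under these assumptions the product of the loadings and the standard error both come to about .08, so the sixth factor reproduces no more of the correlation than sampling error alone could account for.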

In order to give psychological meaning to 
these factor loadings, the arbitrary reference 
frame obtained by the centroid method was 
rotated by the graphical method. The result- 
ing factors are orthogonal. Criteria of positive 
manifold and simple structure were observed 
wherever possible. Since most of the correla- 
tion coefficients are positive, it might be 
expected that a positive manifold could be 
obtained, and this actually was achieved with 
only a few exceptions. 


Interpretation of Factors 


In Table 3, boldface figures indicate the 
factor on which each test variable has its high- 
est loading (a factor loading is the correlation 
between that test and that factor). Coeffi- 
cients of determination (the squared factor 
loadings) give the percentage of variance of a 
given measurement which may be predicted 
by a particular factor. For example, the 
Readership variable has a loading of .64 on 
factor PC; thus .64² = .41, or 41 per cent of 
the variance in readership scores may be 
predicted from this single factor. Note that 
only two of the factors (Pictorial-Color and 
Size) have major loadings on Readership. 
Factor loadings below .20 are not usually 
considered significant; loadings between .30 
and .40 may be important; if the projections 
are .40 or above, the loadings are considered 
significant. 

Factor PC has high positive loadings on 
Readership (.64), Square inches of illustration 
(.51), Number of pictures showing the product 
in use (.51), Number of colors (.49), and 
Previous schedule of advertising (.42). The 
best measures of this factor are those involving 
Pictorial and Color aspects of advertisements, 
hence the factor designation PC. 

Factor S has high loadings on Ad size (.69), 
Number of product benefits (.52), Square 
inches of illustration (.48), Number of words 
(.46), Previous schedule of advertising (.46), 
and Largest type size (.45). Readership 
loading on factor S is .35, or 12 per cent of the 
variance in readership scores is attributable 
to this factor, which seems to involve Size of 
advertisement. 

Factor T has high loadings for Largest type 
size (.62), Readership of surround (.61), 
Largest type used for product identification 
(.48), Number of type sizes (.47), and Point 
size of main body copy (.46). In general, this 
factor seems to be associated with Typographic 
size and variety. Its Readership loading is 
.28, accounting for 8 per cent of readership 
variance. 

Factor In has high loadings for Number of 
words (.71), Number of copy blocks (.65), 
Number of product identifications (.64), Num- 
ber of product benefits (.57), Number of type 
sizes (.47), and Ad size (.45). This factor 
appears to be one of Information, and its load- 
ing on Readership is .18, accounting for only 3 
per cent of readership variance. 

Factor F has only two significant loadings: 
Readership of surround (.76) and Ad size 
(.45). The factor designation F is for Field— 
the influence of the surrounding field or 
background against which the advertisement 
is seen. Another 3 per cent of readership 
variance is accounted for by this factor, which 
has a Readership loading of .16. 

Factor A has significant loadings for Previous 
schedule (.47), Number of pictures of product 
in use (-.44), Pica width of copy measure 
(.43), and Number of illustrations (-.40). 
This factor is difficult to interpret, but tenta- 
tively it is called A, for Advertising schedule 
previously run. It accounts for less than 
1 per cent of readership variance; the criterion 
loading is .09. 

An important conclusion is that collectively 
these six factors account for two-thirds (h² 
= .6766) of the observed variance in readership 
scores of advertisements appearing in the 
February, 1950 issue of American Builder. 
PC alone accounts for 41 per cent of the 
variance in readership scores; PC and S 
together account for 53 per cent of the variance. 
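Because the rotated factors are orthogonal, the share of readership variance attributed to each factor is simply the square of the Readership loading, and the shares add to the communality. The sketch below uses the six Readership loadings quoted in the preceding paragraphs; the variable names are illustrative.

```python
# Readership loadings on the six rotated factors, as quoted in the text.
readership_loadings = {
    "PC": 0.64, "S": 0.35, "T": 0.28, "In": 0.18, "F": 0.16, "A": 0.09,
}

# Coefficient of determination for each factor (squared loading).
variance_share = {name: a * a for name, a in readership_loadings.items()}

# With orthogonal factors the shares sum to the communality of Readership.
communality = sum(variance_share.values())

for name, share in variance_share.items():
    print(f"{name:>2}: {share:.4f}")
print(f"communality: {communality:.4f}")
```

The two largest shares, PC and S, together give the 53 per cent quoted above, and the six shares sum to the .6766 figure.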


Prediction of Readership from Multiple 
Regression Equations 


On the basis of the factor analysis, certain 
variables were chosen which seemed to be 


Table 4 


Correlation Matrix of Three Advertising Variables 


                                            Number of    Square Inches 
Variable                           Size     Colors       of Illustration 
Size of advertisement                        -.07          .71 
Number of colors                   -.07                    .21 
Square inches of illustration       .71       .21 


factorially purest, and which also offered most 
promise for prediction of advertising reader- 
ship. Several combinations of these variables 
were tried in multiple regression equations, 
and the following set of three (see Table 4) 
was selected as providing maximum prediction 
with minimum trouble of measurement. 

R1.234 = .77 (where 1 = Predicted reader- 
ship; 2 = Size of advertisement; 3 = Number 
of colors; 4 = Square inches of illustration). 
Correction for bias gives a shrunken R of .76. 
When nine variables (numbers 1, 3, 11, 20, 21, 
22, 29, 31, 34) were incorporated in a regression 
equation,³ R = .79. The gain of .03 is 
obviously not worth the time involved in 
making these additional measurements. 
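The correction for bias is not spelled out in the text. The sketch below assumes the familiar adjustment in which 1 - R² is multiplied by (N - 1)/(N - k - 1), with N = 137 advertisements and k = 3 predictors; the function name is illustrative, and a different shrinkage formula may have been used in the original computation.

```python
import math

def shrunken_r(r, n, k):
    """Multiple R corrected for bias, assuming the adjustment
    1 - (1 - R^2)(N - 1)/(N - k - 1)."""
    adjusted_r2 = 1.0 - (1.0 - r * r) * (n - 1) / (n - k - 1)
    return math.sqrt(max(adjusted_r2, 0.0))  # guard against a negative estimate

corrected = shrunken_r(0.77, n=137, k=3)
print(round(corrected, 2))
```

Under this assumption the corrected value agrees with the reported shrunken R of .76, which suggests that with 137 cases and only three predictors the shrinkage is slight.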

The best comparison of each variable’s 
contribution to the variance in readership is 
found in column (4) of Table 5, where each beta 
weight is multiplied by the corresponding raw r. 
The regression coefficients, or optimal weights 
by which each variable must be multiplied 
to obtain a maximum multiple R, are given in 
column (5). 

The regression formula computed from the 
American Builder data is: 


X′ = 10.456 + 8.293 (Size of advertisement) 
+ 3.869 (Number of colors) 
+ .181 (Square inches of illustration), 


where X′ = predicted readership, and 10.456 is 
a correction for point of origin. For correla- 
tional purposes, of course, this constant may be 
eliminated from the computations. It is also 
obvious that prediction of readership in other 
magazines by this formula establishes relative 
rather than absolute readership scores, which are 

3 Thorndike’s (14, p. 340) adaptation of the Kelley- 
Salisbury iterative solution for R greatly facilitated 
these computations. 


Table 5 


Correlations with Readership, β Weights, βr Cross 
Products, and Regression Coefficients of 
Three Advertising Variables 


(1)                               (2)     (3)     (4)     (5) 
Variable                          r1k     β       βr      b1k 
Size of advertisement             .62     .441    .273    8.293 
Number of colors                  .37     .341    .126    3.869 
Square inches of illustration     .67     .285    .191     .181 


dependent upon the general readership level 
of a particular magazine. 
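The β weights and multiple R can be recovered from the correlations alone, which offers a check on Tables 4 and 5. The sketch below solves the three normal equations by ordinary elimination rather than by the iterative solution used in the original computations; the helper names and the example advertisement are hypothetical, and the coefficient for square inches of illustration is taken as .181, following column (5) of Table 5.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(n):
            if row != col:
                f = m[row][col] / m[col][col]
                m[row] = [x - f * y for x, y in zip(m[row], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

# Intercorrelations of size, colors, and square inches of illustration (Table 4).
predictor_r = [[1.00, -0.07, 0.71],
               [-0.07, 1.00, 0.21],
               [0.71, 0.21, 1.00]]
# Correlations of the same three variables with readership (Table 5, column 2).
criterion_r = [0.62, 0.37, 0.67]

betas = solve(predictor_r, criterion_r)
multiple_r = math.sqrt(sum(b * r for b, r in zip(betas, criterion_r)))

def predicted_readership(size_pages, n_colors, sq_in_illustration):
    """Regression formula from the American Builder data."""
    return (10.456 + 8.293 * size_pages
            + 3.869 * n_colors + 0.181 * sq_in_illustration)

print([round(b, 3) for b in betas], round(multiple_r, 2))
# Hypothetical advertisement: one page, two colors, 40 square inches of illustration.
print(round(predicted_readership(1, 2, 40), 1))
```

The solved weights agree with column (3) of Table 5 to within rounding of the published correlations, and the sum of the βr products gives the multiple R of .77.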

In order to minimize the possibility of 
computational error in the calculation of R 
and the appropriate regression coefficients, a 
product-moment correlation was computed 
between actual readership scores of the 137 
advertisements in the American Builder study, 
and predicted readership scores of these 
advertisements, based upon the regression 
weights given in Table 5, column (5). The 
correlation coefficient was .76—agreeing with 
the shrunken R of .76. 

The critical point of the study is now at 
hand: the factorial approach proved fruitful 
with the American Builder data, but what is 
the strength of the relationship between the 
mechanical variables of size, color, and amount 
of illustration and readership of advertising in 
other business magazines? Table 6 shows the 
product-moment correlation coefficients be- 
tween actual readership scores of advertise- 


Table 6 


Correlations Between Readership Scores of Advertise- 
ments in ARF Studies, and Readership Scores 
Predicted from the Regression Formula 


                                   Number of 
Magazine Surveyed          r       Advertisements 
Automotive Industries      .58     131 
American Builder           .76     137 
American Machinist         .63     161 
Chemical Engineering       .64     133 
Business Week              .80     101 
Successful Farming: 
  Men readers              .77     217 
  Women readers            .73     217 


ments in other ARF studies (1, 2, 3, 4, 5, 6) 
and readership scores predicted from the 
regression formula. 

The mean r for the four business magazine 
studies is .66 (obtained by Fisher’s z trans- 
formation). When Business Week (6) and 
Successful Farming (1) are included with the 
business magazine studies, the resulting mean 
r is .71. 
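The averaging by Fisher’s z transformation can be sketched as follows. Each r is converted to z, the z values are averaged, and the mean is converted back to r. The sketch assumes an unweighted mean, takes the American Builder entry as .76 (the value reported above for the check on the regression weights), and takes the entry for men readers of Successful Farming as .77; both of these readings, and the function name, are assumptions.

```python
import math

def mean_r_via_fisher_z(correlations):
    """Average correlations through Fisher's z (unweighted; an assumption)."""
    zs = [math.atanh(r) for r in correlations]   # r -> z
    return math.tanh(sum(zs) / len(zs))          # mean z -> r

# Automotive Industries, American Builder, American Machinist, Chemical Engineering.
business_magazines = [0.58, 0.76, 0.63, 0.64]
# Business Week, then Successful Farming (men readers, women readers).
all_studies = business_magazines + [0.80, 0.77, 0.73]

print(round(mean_r_via_fisher_z(business_magazines), 2))
print(round(mean_r_via_fisher_z(all_studies), 2))
```

With these entries the two means come out at the .66 and .71 quoted in the text, which lends some support to the assumed readings of the two table entries.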


Discussion of Results 


The three possible sources of variance in 
readership of advertisements are: (1) differ- 
ences in the attention-getting power of the 
advertisements; (2) differences in respondents’ 
interests and purchasing readiness; and (3) 
chance errors in measurement.⁴ The present 
study is concerned only with readership 
variance attributable to differences in the 
advertisements, whether these differences are 
mechanical (size, color, illustration, etc.) or 
differences in content (number of facts, benefits, 
etc.). 

When this analysis was begun, a possible 
outcome was that only a small part of the 
differences in readership might be accounted 
for by mechanical variables. A recent evalua- 
tion of the importance of content as against 
mechanical variables has been given by James 
D. Woolf (17, p. 43), formerly vice-president of 
the J. Walter Thompson advertising agency, 
who stated: 

“It is my conviction that it isn’t the size of the 
space that puts PULL into an advertisement.
At least size is not the most vital consideration.
Dr. Samuel Johnson said two centuries ago that
‘the soul of the advertisement is the size of the
promise.’ In other words, the size of the
promised benefits, 


⁴ This trichotomy is somewhat oversimplified: the
possibility also exists that for any two advertisements, A and B,
A might have greater immediate attention value than
B, and yet B might be remembered more readily than
A several days after S’s original exposure to A and B.
Thus A might be said to have greater attention value,
but B greater memorability. In the present study, these
two variables (if they actually do exist independently)
are confounded and their effects cannot be measured
separately.




in the eyes of the reader. But they do not 
shape public opinion for a product when and if 
Dr. Johnson’s ‘soul of the advertisement’ is not 
in the copy. Huge size and red ink and thun- 
derous pitch and clamor are not substitutes for 
promised benefits. . . .”

The italics are Woolf’s. This position is 
clearly not supported by the findings of the 
present study. It should be remembered, 
however, that Woolf may refer to consumer 
advertising, and that he may exclude industrial 
advertising from consideration (although he 
does not make this distinction explicit in the 
article from which this quotation is taken), 
and secondly, his undefined “PULL” may 
refer to effect of the advertisement (which this 
study clearly does not attempt to predict) 
rather than to the size of its audience. 

Lucas and Britt (12, p. 289) also stress the 
importance of content: “... the primary 
element in the success of all advertising copy 
is its content or substance. Most of the other 
factors are merely devices for making the 
subject matter more visible, more palatable, 
and easier to comprehend.” 

In an earlier experiment which was designed 
to measure the influence of mechanical vari- 
ables on readership of newspaper advertise- 
ments, Ferguson (9) concluded, “Contrary to 
popular and scientific belief it was found that 
there was no relationship between the size of an 
advertisement and its attention value.” 
Ferguson’s data were based on readership of 
a small daily newspaper, and again there may 
be real differences between readership of 
industrial advertising in business magazines, 
and consumer advertising in small daily 
newspapers. 

Although the present conclusions as to the 
importance of the mechanical variables of size, 
color, and illustration are based primarily upon 
industrial advertising in business magazines, 
it is suggestive that the highest relationship 
between readership scores and readership
predicted by the regression formula was for
Business Week, an executive management
publication which is somewhere in between the
business magazine edited for a particular
industry or occupation, and the general
magazine with almost universal appeal.
Unless this r of .80 (Table 6) represents only a
vagary of sampling, it is reasonable to assume
that the regression weights given in Table 5,
column (5) may prove useful in predicting the 
relative readership of advertisements in general 
magazines. The Successful Farming study, 
with r’s in the .70’s (Table 6), also supports 
this assumption. 


Summary 


Thirty-four advertising variables were de- 
fined, measured, and correlated with reader- 
ship scores for 137 advertisements in the 
February, 1950 issue of the American Builder, 
a business magazine published primarily for 
building contractors. Criterion scores were 
obtained from the Advertising Research Foun- 
dation’s Continuing Studies of business mag- 
azine readership. 

Of these 34 variables, 19 were selected as 
most significantly correlated with the criterion. 
Product-moment intercorrelations were com- 
puted for these 19 variables plus the readership 
scores, and the resulting 20 X 20 correlation 
matrix was factor analyzed. 

Six factors were found to be sufficient to 
account for the intercorrelations. Of these 
six factors, only two, PC (Pictorial-Color) and 
S (Size) have major loadings on Readership. 
The other factors are T (Typographic size and 
variety), In (Informational), F (Field factor, 
or the influence of the surrounding field of the 
advertisement), and A (Advertising schedule 
previously run). Collectively, these six factors
account for two-thirds of the observed variance
in readership scores of the advertisements.
The PC and S factors alone account for 53
per cent of the variance in readership.

On the basis of the factor analysis, certain
variables were chosen which seemed factorially
purest, and a multiple regression equation
was developed to predict readership of advertisements
in other business magazines. A multiple R of .77 was
obtained between readership and the following group of
variables: size of advertisement, number of colors, and
square inches of illustration.

The regression equation was employed to predict
readership of advertisements in six other Advertising
Research Foundation studies.
Predicted readership scores were correlated 
with actual readership scores, and these 
validity coefficients ranged from .58 to .80, with 
an average r of .71. 


Received July 20, 1951. 


References 


1. Advertising Research Foundation. Continuing
study of farm publications: No. 3, Successful
Farming, issue of May, 1947. New York: ARF.

2. Advertising Research Foundation. Continuing
study of business papers: No. 1, Automotive Industries,
issue of October 15, 1948. New York: ARF.

3. Advertising Research Foundation. Continuing
study of business papers: No. 2, American Builder,
issue of February, 1950. New York: ARF.

4. Advertising Research Foundation. Continuing
study of business papers: No. 3, American Machinist,
issue of March 6, 1950. New York: ARF.

5. Advertising Research Foundation. Continuing
study of business papers: No. 4, Chemical Engineering,
issue of March, 1950. New York: ARF.

6. Advertising Research Foundation. Study of executive
management publications: No. 1, Business
Week, issue of April 22, 1950.

7. Advertising Research Foundation. The Advertising
Research Foundation: what it is—what it
does. New York: ARF, 1951.

8. Brad-Vern’s Reports. New York: Printers’ Ink 
Publishing Company, 1950. 

9. Ferguson, L. W. The importance of the mechani- 
cal features of an advertisement. J. appl. Psy- 
chol., 1935, 19, 521-526. 

10. Flesch, R. Measuring the level of abstraction. 
J. appl. Psychol., 1950, 34, 384-390.

11. Lucas, D. B. A rigid technique for measuring the 
impression values of specific magazine advertise- 
ments. J. appl. Psychol., 1940, 24, 778-790. 

12. Lucas, D. B., and Britt, S. H. Advertising psy- 
chology and research. New York: McGraw-Hill, 
1950. 

13. Standard Rate & Data Service, Business publica- 
tion section, 1950, 32, 116. Chicago. 

14. Thorndike, R. L. Personnel selection. New York: 
John Wiley & Sons, 1949. 

15. Thurstone, L. L. Multiple-factor analysis. Chi- 
cago: University of Chicago Press, 1947. 

16. Twedt, D. W. A table for use with Flesch’s level 
of abstraction readability formula. J. appl. 
Psychol., 1951, 35, 157-159. 

17. Woolf, J. D. It isn’t size that puts pull in adver- 
tising. Advertising Age, April 30, 1951, 43. 


Best Sellers Among Popular Psychology Books 


Garry R. Austin 


Counseling and Examinations, Michigan State College


Workers in a number of related disciplines— 
adult education, librarianship, social psychol- 
ogy, communication—have in more recent 
decades interested themselves in the popular 
propagation of knowledge through books.
Among these studies of popular culture, those 
relating to popular psychology books are 
significant. 

Among previous investigations attesting 
somewhat incidentally to the relative import- 
ance of the psychology or “self help” theme is 
a survey conducted by Gallup, on “The 
Favorite Books of Americans” (1). Results
reveal How to Win Friends and Influence People
in fifth place. 

An earlier volume, not in an academic vein, 
but interesting in its uniqueness, insights, and 
entertainment value is Haldeman-Julius’ The 
First Hundred Million (3), an impressionistic 
portrayal of the writer’s experience in publishing
and marketing his Little Blue Book series.
Since all titles were advertised in list form,
each sold by title and by title implication 
alone. Popular subjects proved to be: (1) Love 
and Sex; (2) Self Help; (3) Entertainment 
and Games; (4) Religion and Philosophy; and 
(5) Poetry and Literary Classics.

Illustrative of another of the initial studies 
in the field is the work of Waples and Tyler (6). 
They found, in part, that self-help subjects—
“how to be happy,” “how to keep well,” etc.—
ranked among the foremost in interest, and
that differences in interest existed in groups of
varying educational level. Corroborating
these results are the findings of Gray and
Munroe (2) and of the NORC (5).

The Present Approach


masses, and to determine which ones have 
realized the greatest sale, The Publishers’ 
Weekly nonfiction bestseller lists were examined
for the period 1912 through 1950. (The
Publishers’ Weekly, the book industry trade 
organ, annually publishes, usually in the third
issue in January, a list of the ten fiction and
ten nonfiction books achieving the greatest 
sale in bookstores.) This analysis of non- 
fiction best sellers showed that the practical 
psychology or self-help book was significant in 
number, being the third most popular type and 
accounting for 10.7 per cent of the aggregate 
of 345 issues. Most numerous were the 
biographical and autobiographical titles, 22 
per cent of the volumes being thus classified, 
followed by those with a social and economic 
problems emphasis, which group comprised 
13.3 per cent of the total. Other popular 
categories, in order of decreasing frequency, 
were found to be humor, travel and adventure, 
games and entertainment, history, philosophy 
and religion, war, science and health, geography, 
poetry, and plays and literature. The three 
most popular divisions—biography, social 
problems, and popular psychology—accounted 
for fully 46 per cent of all nonfiction best 
sellers. 

The psychological theme on The Publishers’ 
Weekly best seller lists embraces several 
subclasses: the nontechnical work of the 
scholar, represented by Jacobson’s You Must 
Relax (1934), and Jackson and Salisbury’s 
Outwitting Our Nerves (1922); the Art of Living 
book with Young’s A Fortune to Share (1932) 
or Carnegie’s How to Win Friends and Influence
People¹ (1937-38) as illustrations; the simplified
work of the philosopher, such as Dimnet’s
The Art of Thinking (1929-30) or Lin Yutang’s
The Importance of Living (1938); and the more 
recently noticeable religio-psychological syn- 
thesis such as On Being a Real Person by 
Fosdick (1943) and Peace of Mind by Liebman 
(1946-47-48).

For the last thirty-five years, except during 
the war periods, popular psychology or
behavior books have consistently occupied a


¹ With a printing of 729,000 copies, How to Win
Friends and Influence People in 1937 exceeded all other
nonfiction in sales. Of this fourth most popular book
of all time, 3,500,000 copies have been sold.




place of importance on the best seller lists. 
Literature of the psychology and inspiration 
theme, with its diverse topical emphases,
reflects the tenor of the times more than most
books. At the turn of the century, general 
treatments of psychology and other sciences 
accompanied the beginnings of the scientific 
era. In the nineteen-twenties, the widespread 
interest in occultism and autosuggestion was 
mirrored by volumes on these particular 
themes. During the depression years and the 
chaotic times since the last war, the popularity 
of the scientistic and psycho-religious themes 
made manifest many readers’ search for 
assistance. 

Among the books on psychological themes 
which have appeared on the yearly lists of the 
ten most popular nonfiction books are the 
following: How to Live on 24 Hours a Day by 
Bennett (1912), Crowds by Lee (1913), Laugh 
and Live by Fairbanks (1917-18), The Seven 
Purposes by Cameron (1919), The Mind in the 
Making by Robinson (1922-23), Outwitting Our
Nerves by Jackson and Salisbury (1922), Self 
Mastery Through Conscious Autosuggestion by 
Coué (1923), The New Decalogue of Science by 
Wiggam (1924), Why We Behave Like Human 
Beings by Dorsey (1926-27), The Art of 
Thinking by Dimnet (1929-30), What We Live 
By by Dimnet (1932), A Fortune to Share by 
Young (1932), Life Begins at Forty by Pitkin 
(1933-34), You Must Relax by Jacobson (1934), 
Man the Unknown by Carrel (1936), Wake Up 
and Live! by Brande (1936), Live Alone and 
Like it by Hillis (1936), How to Win Friends 
and Influence People by Carnegie (1937-38), 
Orchids on Your Budget by Hillis (1937), The 
Importance of Living by Lin Yutang (1938), 
How to Read a Book by Adler (1940), Human 
Destiny by Lecomte du Noüy (1947), How to
Stop Worrying and Start Living by Carnegie
(1948), Peace of Mind by Liebman (1946-47-48),
A Guide to Confident Living by Peale
(1948-49), Peace of Soul by Sheen (1949), and 
The Mature Mind by Overstreet (1950). 


Summary 


Since certain earlier surveys have only 
subordinately touched upon the topological 
and quantitative characteristics of popular 
psychology books, the brief study here reported 
has more thoroughly investigated this problem. 
The results establish clearly the availability 
and importance of useful psychology volumes 
as well as the demand for them; and they 
reveal that these volumes comprise more than 
ten per cent of nonfiction best sellers, and that 
their general trends can be traced and classified 
into several topical emphases.

To say that the list reflects the need for 
more vigorous publication and promotion of 
readable and sound works by qualified applied 
psychologists is hardly necessary.


Received March 7, 1952. 
Early publication. 


References 


1. Gallup, G. The favorite books of Americans. New
York Times Book Review, January 15, 1939.

2. Gray, W. S., and Munroe, R. The reading interests
and habits of adults. New York: The Macmillan
Company, 1929.

3. Haldeman-Julius, E. The first hundred million.
New York: Simon and Schuster, 1928.

4. Link, H. C., and Hopf, H. A. People and books:
a study of reading and book-buying habits. New
York: Book Industry Committee, Book Manufacturers’
Institute, 1946.

5. National Opinion Research Center. What-where-why-do
people read. Report Number 28. Denver:
University of Denver, 1946.

6. Waples, D., and Tyler, R. What people want to read
about. Chicago: American Library Association,
University of Chicago Press, 1931.


Book Reviews 


Cattell, R. B. Personality: a systematic theo- 
retical and factual study. New York: Mc- 
Graw-Hill, 1950. Pp. xii+689. $5.50. 


This book is no mere compendium of iso- 
lated facts and theories about personality. 
It is a systematic integration of research find- 
ings, hypotheses and theories, with natural 
emphasis on the author’s own. 

The first four of the twenty-one chapters 
deal with the description and measurement 
of personality, and represent a selective sum- 
mary of an earlier work by the same author. 
Cattell emphasizes the conditional nature of 
historical distinctions between ability, tem- 
perament and motivation. Consonant with 
his wholistic approach, he contends that real 
life behavior is non-modal, and that without 
experimental preselection of variables many 
factors would subtend all three areas.

After mustering a good deal of evidence 
for constitutional determination of the bulk 
of ability and temperament variance, the 
author turns to the nature and development 
of dynamic traits. Since it is through these 
highly plastic traits that most of the changes 
in personality are held to occur, considerable 
attention is devoted to psychodynamics. One 
of the more important innovations is the 
analysis of adaptation-adjustment problems 
in terms of six “dynamic crossroads.” This 
is a theoretical framework designed to permit 
a rapid resolution of any given conflict situation
into its essentials.

Following the 120 page discussion of psycho- 
dynamics is a two-chapter summary of the 
significant findings in psychosomatics. Then, 
several chapters are devoted to the influence 
of the family and other social groups on per- 
sonality development. The remainder of the 
book deals with deviant personalities and a 
longitudinal analysis of personality develop- 
ment from conception to old age. 

For the personality theorist, the final chapter 
is the payoff. The scholarly array of experi- 
mental and clinical material in the rest of the 
book is in a sense but prologue to this final 
integrative attempt. Cattell formulates seven- 
teen “laws” of personality formation, with 


particular reference to those dynamic traits 
held responsible for personality development 
and change. In the last of these principles 
the author may cause many of his colleagues 
to raise a scientific eyebrow at his assertion 
that among the several factors limiting prediction
there may be some as yet unrevealed
indeterminacy inherent in psychological processes.

This volume is recommended by the author 
as a text for both undergraduate and graduate 
students. However, the reviewer is of the 
opinion that the level of theoretical sophisti- 
cation and proportion of controversial content 
make it more suitable for graduate courses 
and seminars. Also, the author presupposes 
some familiarity with quantitative methods, 
particularly factor analysis, which under- 
graduates acquire all too infrequently and 
graduate students reluctantly. 

The book is well documented (over 1,000 
end-of-chapter references), and carefully organ- 
ized, with each chapter rounded off by a 
summary and questions for discussion. More 
important, it is a significant book which should 
not be treated lightly by any student of per- 
sonality. 


Abraham S. Levine 
University of Illinois 


Stolzenberg, Jacob. Psychosomatics and sug- 
gestion therapy in dentistry. New York: 
Philosophical Library, Inc., 1950. Pp. xi 
+ 152. $3.75. 


Psychological factors in the practice of 
dentistry are receiving increasing attention. 
Dentists have always been aware that their 
own behavior and attitudes have a powerful 
effect upon the development of their practice, 
but they have received little instruction in the 
techniques of manipulating the behavior and 
attitudes of their patients. Likewise, little 
attention has been paid to the psychogenic 
factors involved in oral difficulties.

The book under review considers primarily 
the latter two points; viz., 1. How can the 
dentist mold the behavior and attitudes of 
his patients in regard to dental matters? and 




2. To what extent do psychological factors 
affect oral health? 

The author’s answer to the first question
is the use of suggestion. Suggestion is most 
effectively used when the patient is in the 
hypnotic state. Therefore, methods of in- 
ducing hypnosis and the other topics revolving 
around hypnosis are briefly discussed. It 
would take too much space to list all that can 
be accomplished with the dental patient by 
the use of suggestion. Although in the past 
there has been a tendency to oversell the values 
of hypnosis, the present author seems to be 
in the main conservative in his statements. 
The reviewer is not a dentist, but he has 
worked with many dentists in the dental use 
of hypnosis and he can vouch from his own 
experience for most of what the author says 
in this regard. 

The author’s approach to his second pro- 
blem of psychosomatic dentistry is of necessity 
very tenuous because knowledge in that field 
is extremely limited. For as long as there 
have been dentists, the primary emphasis has 
been on the mechanical aspects of the profession.
These mechanical skills of dentistry
will forever remain extremely important, 
but some dentists are beginning to realize that 
the tooth is a part of a total organism and 
cannot be adequately treated in a purely 
mechanical fashion. Stolzenberg is a member 
of that very small band of dentists who are 
forging ahead to supplement their mechanical
skills with psychological knowledge.

The reviewer is a bit apprehensive, however,
that the dentist may go overboard in his
enthusiasm for the psychosomatic approach. 
For example, Stolzenberg states in relation 
to the unconscious grinding of the teeth known 
as bruxism: “This (i.e., the unconsciousness)
naturally follows the fact that bruxism is an 
expression of suppressed aggression, as not 
only the aggression must be suppressed into 
the grinding of the teeth, but any conscious 
knowledge of this grinding must also be sup- 
pressed.” Can we be so certain that sup- 
pressed aggressions produce bruxism? 

It is probably necessary to make dogmatic 
statements in order to arouse people to the 
possibility of new ways of thinking. If 
Stolzenberg can awaken his profession from 
their profound feeling of security in the mere 
mechanical skills of dentistry, then he may be 
excused for his perhaps too confident acceptance
of some of the dogma of psychoanalysis.

The psychologist who reads this book will
likely feel very uncomfortable about some
parts of it, but the book is written for the
dentist. Whose responsibility is it to make
the application of psychological knowledge to 
other professions? The reviewer believes that 
this responsibility rests primarily on the 
shoulders of the psychologist, and if he has 
neglected that responsibility he should not 
complain when members of other professions,
while struggling to make the applications for 
themselves, accept too uncritically some of 
the psychological dogma. 


William T. Heron
University of Minnesota 


New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be sent to Donald G. Paterson, Editor, 
Department of Psychology, University of Minnesota, Minneapolis 14, Minnesota. 


Problems of consciousness. Harold A. Abramson, editor.
New York: Josiah Macy, Jr., Foundation, 1951. 
Pp. 178. $3.25. 

The Greeks and the irrational. E.R. Dodds. Berkeley: 
University of California Press, 1951. About 352 
pages. $5.00.

The psychoanalytic study of the child. Edited by Ruth 
S. Eissler et al. New York: International Universi- 
ties Press, 1951. Pp. 398. $7.50. 

Adjustment to college. Norman Frederiksen and W. B. 
Schrader. Princeton: Educational Testing Service, 
1951. Pp. 504. 

Psychopathology of everyday life. Sigmund Freud. 
Translated by A. A. Brill. New York: The New 
American Library of World Literature, Inc., 1951. 
Pp. 168. $.35. 

Understanding children’s play. Ruth E. Hartley, Law- 
rence K. Frank, and Robert M. Goldenson. New 
York: Columbia University Press, 1952. Pp. 372. 
$3.50. 

Selection, training, and use of personnel in industrial 
research. David B. Hertz, editor. New York: Co- 
lumbia University Press, 1952. Pp. 274. $4.50. 

Social work education in the United States. Ernest V. 
Hollis and Alice L. Taylor. New York: Columbia 
University Press, 1951. Pp. 422. $5.50. 

Thinking, an introduction to experimental psychology. 
George Humphrey. New York: John Wiley and 
Sons, Inc., 1951. Pp. 324. $4.50. 

Family centered maternity and infant care. Edith B. 
Jackson and Genevieve Trainham, editors. New 
York: Josiah Macy, Jr., Foundation, 1950. Pp. 29. 
$.25 each. 26-100 copies, $.15 each. 101 copies or
more, $.10 each. 

The psychology of C. G. Jung. Revised edition. Jolande
Jacobi. New Haven: Yale University Press, 1952.
$3.75.

The design and analysis of experiments. Oscar Kempthorne.
New York: John Wiley and Sons, Inc., 1952.
Pp. 631.

[…] E. H. Manzies and E. Anstey.
London: George Allen and Unwin, Ltd., 1952. 10s. 6d.

Psychosurgical problems. Fred A. Mettler, editor.
Philadelphia: Blakiston […].

The battle for mental health. James Clark Moloney.
New York: Philosophical Library, 1952. Pp. 105.


Readings in personnel administration. Paul Pigors and 
Charles A. Myers. New York: McGraw-Hill Book 
Co., Inc., 1952. Pp. 483. $4.50. 

Adrenal cortex. Elaine P. Ralli, editor. New York:
Josiah Macy, Jr., Foundation, 1951. Pp. 209.
$3.00.

Handwriting, a key to personality. Klara G. Roman.
New York: Pantheon Books, Inc., 1952. Pp. 382.
$6.50.

Union solidarity, Arnold M. Rose. Minneapolis: 
University of Minnesota Press, 1952. Pp. 209. 
$3.00.

Experimenta psychologica. Edgar Rubin. Copenhagen:
Ejnar Munksgaard, 1949. Pp. 356. $4.75.

Health counseling. Milton Schwebel and Ella Freas 
Harris. New York: Chartwell House, Inc., 1951. 
Pp. 238. $3.00. 

Prisoners are people. Kenyon J. Scudder. Garden
City, N. Y.: Doubleday and Co., Inc., 1952. $3.00. 

The graphologist’s alphabet. Eric Singer. New York: 
Philosophical Library, 1951. Pp. 118. $3.75. 

The explanation of human behaviour. F. V. Smith. 
New York: The Macmillan Co., 1952. Pp. 276.
$2.75. 

The single woman of today. M. B. Smith. New York: 
Philosophical Library, 1952. Pp. 130. $2.75.

Basic psychiatry. Edward A. Strecker. New York:
Random House, Inc., 1952. Pp. 512. $3.75. 

Child psychology. George C. Thompson. Boston: 
Houghton Mifflin Co., 1952. Pp. 667. $5.50. 

Industrial psychology. Third edition. Joseph Tiffin. 
New York: Prentice-Hall, Inc., 1952. Pp. 640.

Humanistic ethics. Gardner Williams. New York: 
Philosophical Library, 1951. Pp. 223. $3.75. 

Personality and problems of adjustment. Kimball Young. 
Second edition. New York: Appleton-Century-Crofts,
1952. Pp. 716. $5.00.

Lighting handbook. Second edition. Illuminating Engineering
Society. New York: Illuminating Engineering
Society, 1952. Pp. 987. $8.00.

The social welfare forum, 1951. National Conference 
of Social Work. New York: Columbia University 
Press, 1951. Pp. 380. $5.00.

Frontiers in medicine. New York Academy of Medi- 
cine. New York: Columbia University Press, 1951. 
Pp. 150. $2.50. 

Education of the gifted child. No. 97 of the Curriculum
Bulletin. Eugene: Curriculum Bulletin, University 
of Oregon, 1952. $.55. 




Journal of Applied Psychology 


Vol. 36, No. 4


AUGUST, 1952 


Is How Supervise? an Intelligence Test? * 


Kenneth A. Millard 
Macalester College 


The conviction that How Supervise? “is in 
large part merely a test of intelligence” was 
expressed by Slocombe (10) before any evidence 
on the point was available. To this date 
apparently only one published report of the 
relationship between How Supervise? and a 
mental ability test has appeared. And that 
report was badly misinterpreted in a later 
article that commented upon it. At the same 
time, Harrell (4, p. 137) and especially 
Mosier (6) have echoed Slocombe’s conviction. 

In an aircraft plant Sartain (9) obtained a 
product-moment correlation of —.44 (signif- 
icant at the 1% level) between the Adaptability 
Test, Form A and How Supervise?, Exper- 
imental Edition, Form A for a group of 40 
supervisors. File (1) had reported previously 
a correlation of .35 between highest educational 
level reached and How Supervise? for a group 
of 577 factory supervisors. Following publica- 
tion of Sartain’s findings, File and Remmers 
(2) attempted to explain what they considered 
the marked discrepancy of these two studies. 
But apparently File and Remmers overlooked 
a parenthetical statement in Sartain’s article 
that removes any necessity for explaining a 
discrepancy since none exists.

Sartain, in commenting upon his findings, 
wrote: “The correlation between Adaptability 


and How Supervise? indicates that general 
mental ability goes with favorable supervisory 
attitudes (low scores on this test indicating a 
favorable attitude) to a moderate degree” 


* This article is based on the findings reported in the 
writer’s thesis submitted in partial fulfillment of the 
requirements for the Ph.D. degree in psychology at the 
University of Minnesota. The data reported here on 
supervisors of newspaper carriers and dealers were not 
included in the thesis, although gathered at the same 
time. The writer wishes to thank his major adviser, 
Professor D. G. Paterson, for important guidance 
throughout the planning and conduct of the study. 


(9, p. 331). The published method of scoring
the test is such that high scores, rather than
low scores (as indicated by Sartain), represent
“favorable” supervisory attitudes. For some
unknown reason Sartain apparently used a
“reverse” method of scoring, and actually his
findings and those of File are in close agreement.
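The effect of a reversed scoring key on the sign of a correlation can be shown with a small sketch. All scores below are made up for illustration, and the reversal rule (a hypothetical maximum of 100 minus the score) is an assumption; the article does not describe how Sartain's key was actually reversed.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation of two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores for six supervisors (not data from either study).
adaptability = [10, 14, 18, 22, 25, 30]
how_supervise = [40, 52, 47, 60, 66, 71]   # published key: high = favorable

HYPOTHETICAL_MAX = 100                      # assumed top score, for illustration
reversed_key = [HYPOTHETICAL_MAX - s for s in how_supervise]  # low = favorable

r_published = pearson_r(adaptability, how_supervise)
r_reversed = pearson_r(adaptability, reversed_key)

# Same magnitude, opposite sign: a negative r under a reversed key
# describes the same relationship as a positive r under the published key.
print(round(r_published, 2), round(r_reversed, 2))
```

Because reversal is a linear transformation with a negative slope, it changes only the sign of r, which is why a coefficient of -.44 under a reversed key is comparable in direction to File's .35.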


Procedure 


The present study obtained further evidence 
on the relation between How Supervise? and 
measures of intelligence. Three different kinds 
of supervisors were included. Of the 49 factory 
supervisors, all men, 28 worked in a flour mill 
and 21 in a textile manufacturing plant. 
Eighteen of the 71 office supervisors were 
employed in the home office of a life insurance 
company. The other 53 were distributed among 
the accounting, advertising, circulation, news 
and features, and promotion departments of a 
metropolitan newspaper. Twelve of the 
office supervisors were women. These factory 
and office supervisors were all from the lower 
levels of management. Very few of them 
supervised only other supervisors, although 
many of them had responsibility over some 
other supervisors as well as directly over 
non-supervisory employees. 

The third group, also employed by the 
newspaper, supervised carriers and dealers. 
This group of 77 included 51 city district 
managers, 18 country district managers, and 
eight zone supervisors (who supervise only 
district managers). Seventeen of the city 
district managers were women; all others in 
this total group were men. 

Each of the supervisors in all three groups 
was administered How Supervise?, Form A and 
the Adaptability Test, Form B. Both of these
were answered anonymously, since an opinion
questionnaire about company policies and 
practices was one of the other measures used. 
So that all measures for the same individual 
could be compared, a code number was used 
for each individual. 

For each of the three groups of supervisors, 
the product-moment correlation between 
Adaptability Test and How Supervise? scores 
was obtained. (No significant sex differences 
in either mean or variance for either test were 
found, so data for men and women were 
combined.) Calculations were made from 
grouped data. The assumption of rectilinear- 
ity of regression was tested for each super- 
visory group. In no case was the deviation 
from rectilinearity sufficient to warrant rejec- 
tion of the assumption. The only test of the 
assumption of homoscedasticity was the sub- 
jective one of visual inspection of the scatter- 
gram. 

Because of the known substantial relation- 
ship between tested intelligence and highest 
educational level reached, this study also 
determined the relationship between the 
latter and How Supervise? scores. Each 
supervisor, on a personal information question- 
naire, checked one of the five following 
categories: 8th grade or less, attended high 
school but didn’t graduate, graduated from 
high school, attended college but didn’t 
graduate, graduated from college. Then, for 
each of the three groups of supervisors, analysis 
of variance was applied to determine whether 
there was significant difference in mean How 
Supervise? scores among the educational level 
groups. (Using the method of Welch, the
assumption of homogeneity of within-groups 
variances was tested and in each case found to 
be a tenable assumption.) Where analysis of 


Kenneth A. Millard 


variance revealed significant difference among 
these means, epsilon, the unbiased correlation 
ratio (8), was computed to get an indication of 
the degree of relationship between highest edu- 
cational level reached and How Supervise? 
scores. If the assumption of rectilinearity of 
regression is warranted, then epsilon is com- 
parable to the product-moment correlation 
coefficient. 


Results 


Factory Supervisors. The relation between
Adaptability Test and How Supervise? scores
is presented in Table 1. The r of .71 for the
factory supervisors is not much less than the
equivalent form (Forms A and B) reliability
of .77 for How Supervise? (3). The 99%
confidence limits for this r of .71 are .47 and
.85. For our particular group of factory
supervisors, 50 per cent of the variance in
How Supervise? scores is accounted for by the
variance in Adaptability Test scores.
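The confidence limits and the proportion of shared variance quoted above can be checked with a short sketch. It assumes the limits were obtained with Fisher's z transformation and that the factory group numbered 49, as read from Table 1; the text does not name the method, so the function name `fisher_ci` and the choice of method are illustrative only.

```python
import math

def fisher_ci(r, n, z_crit):
    """Confidence limits for a correlation via Fisher's z transformation.

    Assumption: this is the conventional method; the article does not
    say how its limits were computed.
    """
    z = math.atanh(r)                # transform r to z
    se = 1.0 / math.sqrt(n - 3)      # standard error of z
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

r, n = 0.71, 49                      # factory supervisors
low, high = fisher_ci(r, n, 2.5758)  # 2.5758 = normal deviate for 99% limits
shared = r ** 2                      # proportion of variance accounted for

print(round(low, 2), round(high, 2), round(shared, 2))
```

Under these assumptions the sketch returns limits that round to .47 and .85, and a squared correlation that rounds to .50, in agreement with the figures in the text.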

Analysis of variance revealed a significant 
(1% level) relationship between highest educa- 
tional level reached and How Supervise? for 
the factory supervisors. The value of epsilon, 
computed to determine the degree of this 
relationship, was .45. This value is similar to 
the correlation of .35 reported by File (1) 
for the same relationship. 

Office Supervisors. The r between Adaptability
Test and How Supervise? scores for the
office supervisors is .22, which is significant 
only at the 8% level. Moreover, the relation- 
ship between highest educational level reached 
and How Supervise? for this group of super- 
visors is not significant at even the 5% level. 

Carrier and Dealer Supervisors. For the 
supervisors of newspaper carriers and dealers 
the r between Adaptability Test and How 


Table 1

Relation Between Adaptability Test and How Supervise?

                                        Adaptability     How Supervise?
Supervisory Group                  N     Mn      SD       Mn      SD       r
Factory                           49    12.9     6.7     42.0    12.0    .71**
Office                            71    20.9     6.0     45.7     8.8    .22
Newspaper carriers and dealers    77    16.9     6.9     42.1    10.3    .62**

** Significant at the 1 per cent level.




Is How Supervise? an Intelligence Test? 


Supervise? scores is .62, a value significant at 
the 1% level. The relation between highest
educational level reached and How Supervise? 
for this group is significant at the 5% level, 
with the degree of the relationship being 
indicated by an epsilon of .30. 


Discussion 


It is pertinent to compare our r of .71 for 
factory supervisors with Sartain’s (9) r of .44 
(giving it a plus sign to have a comparable 
meaning) between the same two measures for 
a group of aircraft plant supervisors. The 
difference between these two r’s reaches signif- 
icance at the 6% level. Whether our larger r 
might be at least partly a function of greater 
variability for either or both measures is 
unknown because Sartain did not report any 
measures of variability. It might be that the 
experimental edition of How Supervise?, which 
he used, was not as much a measure of intel- 
ligence as the later editions. 

The findings from the present study focus 
particular concern on the substantially lower 
relationship between intelligence and How 
Supervise? for the office supervisors than for 
either of the other two groups of supervisors. 
(The differences between the office supervisors’ 
r and that for each of the other two groups are 
significant at considerably less than the 1% 
level; on the other hand, the difference between 
the r’s for the other two groups does not even
approximate significance at the 5% level.) 
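The three comparisons between correlations can be sketched as follows. The article does not say which test was used; the sketch assumes the usual large-sample test on Fisher z values for independent groups, with the r's and N's taken from Table 1. The function names are illustrative.

```python
import math

def z_for_difference(r1, n1, r2, n2):
    """Normal deviate for the difference between two independent r's."""
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (math.atanh(r1) - math.atanh(r2)) / se

def two_tailed_p(z):
    """Two-tailed probability of a normal deviate at least this large."""
    return math.erfc(abs(z) / math.sqrt(2.0))

factory = (0.71, 49)
office = (0.22, 71)
carriers = (0.62, 77)

p_factory_office = two_tailed_p(z_for_difference(*factory, *office))
p_carriers_office = two_tailed_p(z_for_difference(*carriers, *office))
p_factory_carriers = two_tailed_p(z_for_difference(*factory, *carriers))

print(p_factory_office, p_carriers_office, p_factory_carriers)
```

On these assumptions the office group differs from each of the other two groups beyond the 1% level, while the factory and carrier-dealer groups do not differ at the 5% level, which is the pattern described in the text.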

To seek an explanation of this lower r 
for the office supervisors we first take a look 
at the variabilities of the three groups on the 
two measures (cf. Table 1). (Incidentally, in 
the case of How Supervise? the means and 
variabilities for our groups of supervisors 
approximate the values reported in the test 
manual for comparable groups. For the 
Adaptability Test the test manual does not 
give data for any directly comparable groups.) 
Comparing the variances on the Adaptability 
Test for each of our pairs of supervisory groups, 
we find none of the F values significant at even 
the 10% level. Two of the three pairs do not
have significant differences for variance on
How Supervise? either. However, the difference
in variance on How Supervise? between
the factory and office supervisors is significant
at the 2% level. Therefore, we know that at
least part of the reason for the smaller r for the
office group is its smaller variability. To
determine the effect of this smaller variability
on the r for the office group, a new r was
computed, using the How Supervise? SD of the
factory group as the uncurtailed SD for that
measure. This new r for the office group was
.29, a value just barely short of significance
at the 5% level.
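The adjustment for curtailed variability can be illustrated with a brief sketch. It assumes the standard correction for restriction of range on the curtailed variable; the article does not name the formula, so this is offered only as one computation consistent with the reported result. The SDs are those given for How Supervise? in Table 1.

```python
import math

def correct_for_curtailment(r, sd_curtailed, sd_uncurtailed):
    """Estimate r in the uncurtailed group from r in the curtailed group.

    Assumption: the usual restriction-of-range formula, with
    k = uncurtailed SD / curtailed SD.
    """
    k = sd_uncurtailed / sd_curtailed
    return (r * k) / math.sqrt(1.0 - r ** 2 + (r ** 2) * (k ** 2))

# Office r of .22; office SD 8.8 and factory SD 12.0 on How Supervise?
new_r = correct_for_curtailment(0.22, 8.8, 12.0)
print(round(new_r, 2))
```

With these inputs the corrected value rounds to .29, the figure reported for the office group.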

Since differences in variability account for 
very little of the difference in r’s we next look 
at differences in means. On the Adaptability 
Test all three differences between means are 
significant at probabilities of 1% or less. In the 
case of How Supervise? only one of the differ- 
ences between means is significant; viz., that 
between the office supervisors and the carrier 
and dealer supervisors, which is significant at 
the 5% level. So we find no consistent pattern
between differences between means on the
one hand and differences between r’s on the
other. (In determining significance of difference
between means the t test was used in all
instances but one. For the difference between
means of office and factory supervisors on
How Supervise?, t is not appropriate because
of the significant difference in variances.
Therefore, in that case the Behrens-Fisher d
test (5, 11) was used.)

Another relevant comparison is that for 
highest educational level reached. The distri- 
bution for each supervisory group on this 
variable is presented in Table 2. Each of the 
pairs of distributions was compared by means 
of the chi-square test. For each of the three 
comparisons the chi-square value was signif- 
icant at the .1% level, a finding consistent 
with the significance of differences between 
means on the Adaptability Test. 

Table 2

Frequency Distributions for Highest Educational
Level Reached

                                              Supervisory Group
                                                             Newspaper
                                                             Carriers and
Educational Level                          Factory*  Office  Dealers*
8th grade or less                             18        2        5
Attended high school but didn’t graduate      15        8        6
Graduated from high school                     7       14       45
Attended college but didn’t graduate           4       19       17
Graduated from college                         4       28

* One factory supervisor and one carrier and dealer
supervisor did not give information about highest
educational level reached.

It is seen then that neither differences in
variability nor differences in means provide
any adequate explanation for the markedly
lower r of the office supervisors. There
remains one additional characteristic which
warrants examination; viz., the relative
variability on each measure for the three groups.
The use of the coefficient of relative variation
(the ratio of SD to the Mn, expressed as a
percentage) with psychological measurements
has been questioned because of the lack of a
true zero point and unequal units of measurement.
However, Peatman (7, p. 171) contends
that it is legitimate to compare the coefficients
of relative variation of two groups on the
same psychological measure. He also presents
a test of significance for such a difference.
When we compare the coefficients of relative
variation on the Adaptability Test for our
three groups, we obtain these interesting
findings: the difference between the office
group and each of the other two groups is
significant at the 1% level, while the difference
between the other two groups is not significant
at even the 5% level. A similar pattern shows
up when we compare the coefficients of relative
variation on How Supervise?. The office vs.
factory difference is significant at the 1% level,
the office vs. carriers and dealers difference is
significant between the 5% and 6% level, and
the factory vs. carriers and dealers difference
does not even approximate significance at the
5% level. It is clearly seen, then, that the
pattern of differences between coefficients of
relative variation for both measures corresponds
with the pattern of differences between r’s.
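The coefficients of relative variation themselves are not printed in the article, but they follow directly from the definition given above (SD divided by Mn, expressed as a percentage). The sketch below computes them from the means and SDs as read from Table 1; the dictionary layout and names are illustrative, and Peatman's significance test is not reproduced here.

```python
def relative_variation(sd, mean):
    """Coefficient of relative variation: SD as a percentage of the mean."""
    return 100.0 * sd / mean

# (Adaptability Mn, SD, How Supervise? Mn, SD), as read from Table 1
groups = {
    "Factory": (12.9, 6.7, 42.0, 12.0),
    "Office": (20.9, 6.0, 45.7, 8.8),
    "Carriers and dealers": (16.9, 6.9, 42.1, 10.3),
}

cv = {
    name: (relative_variation(a_sd, a_mn), relative_variation(h_sd, h_mn))
    for name, (a_mn, a_sd, h_mn, h_sd) in groups.items()
}

for name, (cv_adapt, cv_how) in cv.items():
    print(f"{name}: Adaptability {cv_adapt:.1f}%, How Supervise? {cv_how:.1f}%")
```

On both measures the office group shows the smallest relative variability, which is the direction of difference the paragraph describes.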


Summary 


1. Using the Adaptability Test as a measure 
of intelligence, this study found a substantial 
correlation between intelligence and How 
Supervise? for factory supervisors and super- 
visors of newspaper carriers and dealers. 
For office supervisors, this correlation was
considerably smaller and of less certain
significance.

2. Using highest educational level reached 
as an indirect measure of intelligence, similar 
relationships were found, but of a smaller 
degree. 

3. Only a very small part of the markedly 
lower relationship between intelligence and 
How Supervise? for the office supervisors is 
accounted for by differences in variability. 
Nor do differences in means show a pattern 
similar to the differences in intelligence-How 
Supervise? relationship for the three groups of 
supervisors. However, the pattern of differ- 
ences in relative variability corresponds pre- 
cisely with the pattern of differences for this 
relationship. 

4. The frequently expressed conviction that 
How Supervise? is essentially an intelligence 
test seems to be considerably substantiated 
for two kinds of supervisors, but not for a 
third, by this study. 

Received August 23, 1951. 


References 


1. File, Q. W. The measurement of supervisory
quality in industry. J. appl. Psychol., 1945, 29,
323-337.

2. File, Q. W., and Remmers, H. H. Studies in
supervisory evaluation. J. appl. Psychol., 1946,
30, 421-425.

3. File, Q. W., and Remmers, H. H. Manual for How
Supervise? (Rev. Ed.) New York: The Psychological
Corporation, 1948.

4. Harrell, T. W. Industrial psychology. New York:
Rinehart, 1949.

5. Johnson, P. O. Statistical methods in research.
New York: Prentice-Hall, 1949.

6. Mosier, C. I. Review of How Supervise? In O. K.
Buros (Ed.), The third mental measurements yearbook.
New Brunswick: Rutgers University Press,
1949. Pp. 727-728.

7. Peatman, J. G. Descriptive and sampling statistics.
New York: Harper, 1947.

8. Peters, C. C., and Van Voorhis, W. R. Statistical
procedures and their mathematical bases. New
York: McGraw-Hill, 1940.

9. Sartain, A. Q. Relation between scores on certain
standard tests and supervisory success in an
aircraft factory. J. appl. Psychol., 1946, 30,

10. Slocombe, C. S. Appraisal of Mr. File’s study.
Person. J., 1946, 24, 251-254.

11. Sukhatme, P. V. On Fisher and Behrens’ test of
significance for the difference in means of two
normal samples. Sankhya: The Indian J. Statistics,
1938, 4, 39-48.


Reading Ease Scores for File’s How Supervise? 


Paul W. Maloney 


University of Minnesota 


File (2), in building How Supervise?, defined 
supervisors as “individuals between the group 
leader and departmental supervisor levels.” 
As Bellows (1) and Tiffin (7) have shown, such 
a group has a wide range of reading ability. 
Therefore, if these people are to be tested for 
supervisory ability and nothing else, the
directions and test items should be highly
readable. This point was recently clarified by
Johnson and Bond (5) in relation to vocational
guidance tests commonly given at V. A.
Advisement Centers.

File (2) recognized this. He said that the
items “. . . must be simply worded so that
any supervisor can see the problem involved”
(italics File’s). His criterion of readability,
however, was merely the opinion of various 
industrial experts, nine-tenths of whom thought 
the test to be clearly written. A more objec- 
tive criterion would be the application of 
Flesch’s (3) reading ease formula to the 
directions and test items. Such an analysis 
is reported here. The total content (rather 
than random samples) of the directions and 
items of Forms A and B has been subjected 
to the formula. 
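For readers who wish to verify the scores, Flesch's 1948 reading ease formula can be sketched in a few lines. The constants are those of the published formula; the function name and the choice of sections checked are illustrative, and the inputs are the sentence lengths and syllable counts reported in Table 1.

```python
def reading_ease(syllables_per_100_words, words_per_sentence):
    """Flesch (1948) reading ease score; higher scores mean easier reading."""
    return (206.835
            - 0.846 * syllables_per_100_words
            - 1.015 * words_per_sentence)

# (syllables per 100 words, average sentence length) from Table 1
form_a_section_1 = round(reading_ease(169.8, 11.3))
form_a_section_2 = round(reading_ease(184.2, 12.8))
form_b_section_2 = round(reading_ease(182.9, 12.9))

print(form_a_section_1, form_a_section_2, form_b_section_2)
```

The three sections checked here come out at 52, 38, and 39, matching the tabled scores.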


Results 


The results of the Flesch analysis are 
summarized in Table 1. Flesch’s (4) grade
level equivalents are included. Note that
three of the sections of test items are written 
for individuals with 10th to 12th grade 
reading ability. Two sections and the direc- 
tions are at the high school graduate level. 
One section has a score of 50, lying just 
between the two above mentioned levels. 
The mean readability for both forms, directions 
included, is of high school graduate difficulty. 
If reading ability is not to be a factor in the 
response made to the test, it seems safe to 
conclude that those who are tested with How 
Supervise? should have high school graduate 
reading ability. 

The obvious question, then, is as to whether 
or not this is the case. Do most industrial 
supervisors have the reading ability requisite 
to taking How Supervise? A person’s reading 
grade level can be roughly related to his 
educational level (very roughly—Johnson and 
Bond (5) caution that “. . . an individual's 
reading level may vary as much as six grades 
from his educational level”). The educational 
attainment of industrial foremen is listed in 
Table 2. This information is from a national 
sample (5 per cent of the population) drawn 
in the 1940 U. S. census. 

Table 1

Reading Ease of File’s How Supervise?

                   Average    Number of      Reading
Test               Sentence   Syllables per  Ease
Form   Section     Length     100 Words      Score    Flesch’s Grade Level
A      1           11.3       169.8          52       Some high school
A      2           12.8       184.2          38       High school or some college
A      3           15.6       160.5          56       Some high school
B      1           13.4       165.1          54       Some high school
B      2           12.9       182.9          39       High school or some college
B      3           14.4       168.4          50       High school
Both   Directions  14.3       170.6          48       High school or some college

Means for Total    13.5       171.6          48       High school or some college

Table 2

Education Attainment of Male Industrial Supervisors*

                                   Foremen
Grade Reached                Number     Per Cent
College:
  4 years or more            12,960       2.64
  1 to 3 years               26,640       5.42
High School:
  4 years or more            81,180      16.51
  1 to 3 years              102,600      20.87
Elementary School:
  7 and 8 years             197,840      40.24
  5 and 6 years              41,220       8.38
  Less than 5 years          25,320       5.15
Schooling not reported        3,860        .79

Totals                      491,620     100.00

* Population, The Labor Force, Occupational Characteristics.
U. S. Bureau of Census, 16th Census; pp. 59
and [illegible].

It is apparent that the range of educational
attainment of industrial supervisors is wide.
The median number of years of schooling
completed is only 8.8. Less than a quarter
of the foremen in the sample had a high
school education. We noted that a large
part of How Supervise? is at the high school
graduate reading level. Reading ability,
insofar as it is related to educational attainment,
is a factor which would be likely to
influence the scores of most industrial supervisors
when taking the test.
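The statement that less than a quarter of the foremen had a high school education can be checked against the census counts in Table 2. The sketch below treats "a high school education" as four or more years of high school, including any college; that reading is an assumption, and the variable names are illustrative.

```python
# Counts of foremen by highest grade reached (Table 2, 1940 census sample)
counts = {
    "College, 4 years or more": 12960,
    "College, 1 to 3 years": 26640,
    "High school, 4 years or more": 81180,
    "High school, 1 to 3 years": 102600,
    "Elementary, 7 and 8 years": 197840,
    "Elementary, 5 and 6 years": 41220,
    "Elementary, less than 5 years": 25320,
    "Schooling not reported": 3860,
}

total = sum(counts.values())

# Assumption: high school graduates = 4 years of high school or any college
graduates = (counts["College, 4 years or more"]
             + counts["College, 1 to 3 years"]
             + counts["High school, 4 years or more"])

share = graduates / total
print(total, round(100 * share, 1))
```

The counts sum to the tabled total of 491,620, and the graduate share comes to about 24.6 per cent.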

Evidence of this contamination may be 
found in a study by Millard (6). He compared 
scores on How Supervise? with scores on the 
Adaptability Test. For office (higher level) 
supervisors there was a low correlation, .22. 
However, the correlation for a group of factory 
(lower level or first line) foremen was .71. 
This difference appears to be a function of 
reading ability. It is unlikely that many of
the office supervisors had much difficulty
reading How Supervise? But within the
group of factory foremen there probably were
many who had difficulty understanding material
written for people of high school graduate
ability. Their scores would then reflect the
extent to which they could comprehend the
material, rather than their knowledge of the
principles of supervision. The Adaptability
Test also measures the ability to understand
written material. So for factory foremen of
limited schooling it is only natural that
Adaptability Test scores should correlate highly
with How Supervise? scores.

Readability as measured by Flesch is a 
function of the numbers of syllables and words. 
How Supervise? needs revision in both respects. 
For example, item 51 in Form A now reads 
“The only important requirement of a good 
supervisor is a complete understanding of the 
jobs he is to supervise.” Sentence and word 
length are unnecessarily long. A shorter 
version might read “All that supervisors 
need really know are the jobs they supervise.” 
Similarly, “The goals of management and 
labor are directly opposed and must always be 
in conflict with each other” (item 69, Form A) 
would carry the same connotation if more 
simply phrased, “Management and labor have
different aims which will always conflict.”
Again, “Constant demands upon the time of
top-executives make it impractical for them
to spend any time in actual conversations with
workers” (item 63, Form B) could just as well
be “Top managers are so busy that they should
not be asked to talk to the workers.”


Summary 


The readability of the directions and items 
in File’s How Supervise? is at the difficult level. 
Much of the material is at the high school 
graduate level of readability. In contrast, less 
than a quarter of a nation-wide sample of 
foremen were high school graduates. To the
extent to which educational attainment is an
indication of reading ability the test can be
read only with great difficulty by the average
factory foreman. Therefore, for lower level
personnel How Supervise? is of doubtful
validity as a measure of supervisory ability.
Three items were rephrased to show how the 
test could be made more readable. 


Received December 5, 1951. 
Early publication. 


References 


1. Bellows, R. M. Psychology of personnel in business
and industry. New York: Prentice Hall, [year illegible].

2. File, Q. W. The measurement of supervisory quality
in industry. J. appl. Psychol., 1945, 29, 323-337.

3. Flesch, R. A new readability yardstick. J. appl.
Psychol., 1948, 32, 221-233.

4. Flesch, R. How to test readability. New York:
Harper and Bros., 1951.

5. Johnson, R. H., and Bond, G. L. Reading ease of
commonly used tests. J. appl. Psychol., 1950,
34, 319-324.

6. Millard, K. A. Is How Supervise? an intelligence
test? J. appl. Psychol., 1952, 36, 221-225.

7. Tiffin, J. T., and Colby, A. N. The reading ability
of industrial supervisors. Personnel, 1950, 27,
156-159.


Temperament Traits of Executives and Supervisors Measured by the 
Guilford Personality Inventories * 


Joan S. Guilford 


University of Southern California 


Because of the dearth of effective selection 
devices for screening of executives and super- 
visors in industry, this study was conducted 
as an attempt to uncover traits of temperament 
which might be important in the successful 
performance of duties by men in these positions. 
The meager information concerning the use of 
temperament tests in connection with executive 
and supervisor selection makes it difficult to 
evaluate these devices for selection of such 
personnel. However, since the pencil-and- 
paper test is the most available medium at 
present for the assessment of temperament 
traits, any investigation of its effectiveness 
would appear to be worthwhile. 

Previous studies have consistently revealed 
low validity for scores on personality tests in 
vocational uses. In several cases, however, 
there has been some improvement in prediction 
of success and any improvement in prediction, 
particularly at these high levels in the organizational
hierarchy, is valuable. It is generally
agreed by authorities in the field of personnel 
selection that traits of temperament contribute 
greatly to the success or failure of men in 
executive and supervisory positions, while 
technical skills which are more easily evaluated 
are relatively less important. Because of the 
necessity for well-qualified individuals at 
these high levels and the relatively large 
contribution of temperament traits to their 
success, there is great need for research in the
field of personality testing in the industrial
situation.


The Problem 


The purpose of this study was twofold.
It was desired to determine: (1) in what 
specific respects the executive differs from the 
supervisor with regard to certain personality 
traits and (2) the validity of these same traits 
as related to a criterion of job success, defined 
in terms of ratings of “job performance.” 


* This article consists of portions drawn from an
unpublished Master’s thesis by the writer (1).


The Procedure 


The subjects of this study were: (1) 208 
executives in a large chain grocery¹; and (2)
143 supervisors in the same company. The 
executive in this organization serves as a link 
between branch headquarters and the stores. 
He is responsible for supervising the store 
managers of from twelve to twenty stores in 
his district, promoting sales, controlling 
expense consistent with maximum sales, and 
developing and maintaining competent organ- 
izations in his stores. He employs, disciplines, 
and separates employees in his group or is 
responsible for recommending such action. 
The supervisor (or foreman) directs the 
employees in his plant or department, working 
with them at all times. He is responsible for 
the quantity and quality of production in his 
department, for maintaining satisfactory work- 
ing conditions, making decisions regarding 
production problems, handling employee griev- 
ances, making suggestions for improvements 
in the operations he supervises, scheduling and 
training employees, and employing, disciplin- 
ing, and separating workers under him or 
making recommendations for such action. 


To further differentiate between these groups,
it might be added that in choosing a type of
work the district manager (executive) was
originally attracted to the “retail culture”
which is assumed to emphasize gregariousness
and ascendance; the foreman (supervisor) was
attracted to the “factory and warehouse culture”
involving physical labor and more tangible
and obvious duties. The executive is “on
his own” in a geographic territory and makes
many independent decisions. He provides
leadership for his area. The store managers
whom he supervises are at a level equivalent
to (and in most cases higher than) the foreman
group. He typically has 100 to 150 employees
in his district. The foreman (or supervisor)
usually supervises 20 to 30 men in the unskilled
or semi-skilled labor classifications. He operates
a department or unit in a factory or
warehouse and is himself under fairly direct
supervision.

¹ The writer wishes to thank Mr. Glenn E. Mitchell
of the Kroger Company for furnishing the data and
supplementary information used in this study.

The tests administered to these two groups 
were the Guilford series of personality inventories:
Inventory of Factors STDCR (2);
Guilford-Martin Inventory of Factors GAMIN 
(3); and Guilford-Martin Personnel Inventory 
(4). The traits represented in these tests 
are: (1) Social Introversion-Extraversion; (2) 
Thinking Introversion-Extraversion; (3) De- 
pression; (4) Emotional Stability; (5) Rhathy- 
mia or Impulsiveness; (6) General Activity; 
(7) Ascendance or Social Boldness; (8) Mascu- 
linity of Attitudes and Interests; (9) Inferiority 
Feelings; (10) Nervousness; (11) Objectivity; 
(12) Agreeableness; and (13) Cooperativeness. 

These factors emerged from several previous 
factor analyses and so might be assumed to 
represent relatively distinct dimensions of per- 
sonality. The reliabilities of the scores have 
been found to be adequate, ranging generally 
from .80 to .92.

The criterion was a composite of ratings of 
“job performance” made, in the case of the 
executives, by members of the training staff. 
The raters were two in number and the ratings 
were made independently. In the case of the 
supervisors, ratings were made by individuals 
occupying a position midway between the 
supervisor and the executive, the superintendents.
These ratings were in letter form: an
“A” rating was defined as “very outstanding
performance,” “B” as “good performance,”
“C” as “average performance,” and “D” as
“just passing performance.” In both groups
the men were experienced workers and were
rated while on the job. For the sake of computation
these letter ratings were converted
into numerical ratings of 30, 20, 10, and 0 for
A, B, C, and D, respectively. Since the executives’
ratings were presented separately, it was
possible to compute their intercorrelation which
proved to be .53. Assuming this to be a reliability
index, the combination of the ratings
attains a Spearman-Brown reliability estimate
of .69. In the case of the supervisors only the
pooled rating was available.
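The Spearman-Brown estimate quoted above follows from the general prophecy formula applied to two raters. A minimal sketch, with an illustrative function name:

```python
def spearman_brown(r, k=2):
    """Reliability of a composite of k parallel ratings, each pair correlating r."""
    return (k * r) / (1.0 + (k - 1) * r)

# Intercorrelation of the two raters' ratings of the executives
pooled_reliability = spearman_brown(0.53, k=2)
print(round(pooled_reliability, 2))
```

For an intercorrelation of .53 and two raters the estimate rounds to .69, the value reported.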

It is difficult to determine the extent to which 
the testing conditions might have controlled 
extraneous sources of variance since the tests 
were administered in branches of the company 
scattered throughout the East and Middlewest. 
The greatest danger, of course, lies in the possi- 
bility of uncontrolled motivation which might 
lead to falsification of responses on the test 
items. In this particular case, the tests were 
given experimentally so that tendency to falsify 
is probably much smaller than it would be were 
these tests used as selection devices. This is 
one of the most serious weaknesses of these 
tests, that, with sufficient psychological insight, 
an individual can alter his profile of traits to fit 
what he believes should be typical of the individual
who successfully fills the position for
which he is an applicant.


Results 


The first statistical procedure was to determine
means and standard deviations for the
two groups. Then, for purposes of comparison,
t ratios were computed for the differences
between the means of the two groups on each
trait. These appear in Table 1. To simplify
interpretation, profiles of the two groups,
based on the mean scores, were plotted on the
background of the norms obtained for a
population heavily weighted with college
students, provided by the authors of the
tests. These profiles appear in Figure 1.

Table 1

Means, Standard Deviations, Differences
Between Means and t Ratios

         Test                  Standard
Group    Variable    Mean      Deviation       t
E        S           12.2        7.20        8.69**
S        S           19.8        8.61
E        T           34.3        7.74         .60
S        T           34.8        7.89
E        D           10.8        7.35        6.00**
S        D           16.1        8.49
E        C           15.2        8.76        4.00**
S        C           19.5       10.48
E        R           41.2        8.56        3.19**
S        R           38.0        9.68
E        G           12.9        4.22        [illegible]
S        G           10.8        4.64
E        A           26.0        4.58        6.81**
S        A           21.8        6.33
E        M           22.3        4.46         .23
S        M           22.4        4.48
E        I           42.1        5.04        5.48**
S        I           38.5        6.93
E        N           34.1        6.42        3.27**
S        N           31.5        [illegible]
E        O           59.6        9.30        3.00**
S        O           55.3       11.70
E        Ag          40.4        8.27        2.20*
S        Ag          38.3        9.24
E        Co          78.0       14.00        6.54**
S        Co          67.0       16.30

* Significant at the 5 per cent level.
** Significant at the 1 per cent level.

Fig. 1. A comparison between executives and supervisors.
[Profile chart: C-scores from 0 to 10 for traits S, T, D, C, R, G, A, M, I, N, O, Ag, and Co, with one profile line for executives and one for supervisors.]

The next procedure was to compute Pearson 
r correlations between test scores and ratings 
for each of the traits in the two groups. Since 
the ratings covered such a small range, it 
was necessary to correct the coefficients for 
coarse grouping. The correlations are pre- 
sented in Table 2. 

Assuming that the difference in level of 
position indicates to some extent a variable 
of supervisory success or excellence, scores on 
each trait were correlated with this executive-supervisor
dichotomy using the point-biserial-r
formula. The point-biserial correlations appear
in Table 3.
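A point-biserial coefficient of this kind can be recovered from the group means, standard deviations, and sizes alone. The sketch below assumes executives are coded 1 and supervisors 0, and that the tabled SDs can be treated as within-group population SDs; both are assumptions, and the function name is illustrative. The inputs are the Table 1 values for traits S and A with the 208 and 143 group sizes.

```python
import math

def point_biserial(m1, sd1, n1, m0, sd0, n0):
    """Point-biserial r between a score and a 1/0 group code, from summary data."""
    n = n1 + n0
    p, q = n1 / n, n0 / n
    grand_mean = p * m1 + q * m0
    # Total variance = weighted within-group variance + between-group variance
    total_var = (p * (sd1 ** 2 + (m1 - grand_mean) ** 2)
                 + q * (sd0 ** 2 + (m0 - grand_mean) ** 2))
    return (m1 - m0) / math.sqrt(total_var) * math.sqrt(p * q)

# Executives (n = 208) coded 1, supervisors (n = 143) coded 0
r_s = point_biserial(12.2, 7.20, 208, 19.8, 8.61, 143)  # trait S
r_a = point_biserial(26.0, 4.58, 208, 21.8, 6.33, 143)  # trait A

print(round(r_s, 2), round(r_a, 2))
```

Under these assumptions the sketch gives -.43 for S and .36 for A, in agreement with the corresponding entries of Table 3.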


Results 


From Table 1 it can be seen that the executive
averages significantly more: (1) sociable,
(2) free from depression, (3) emotionally
stable, (4) happy-go-lucky or impulsive, (5)
active, (6) ascendant or socially bold, (7) self-confident
or free from inferiority feelings,
(8) calm and composed or free from nervousness,
(9) objective, (10) agreeable, and (11)
cooperative than the supervisor. Since the
executive can be assumed to be more successful
than the supervisor (if success is defined in
terms of status, responsibility and income), it
can also be assumed that he will possess a
somewhat different pattern of traits than will
the supervisor, if temperament contributes
to his success.

It can be seen from the profiles (Fig. 1) 
that while these differences are evident, the 
more outstanding feature is the close similarity 
between the two profiles. The personality 
patterns might be said to follow the same 
general trend. It might also be noted that 
both of the groups are generally quite a bit 
above the population mean represented by 
college students. This deviation appears to 
be in the direction of “better adjustment.”

The validity coefficients proved to be low in
most cases. For executives the variables
which correlated significantly with the criterion
of success were: (1) sociability, (2) freedom
from inferiority feelings, (3) cooperativeness,
and (4) masculinity. For supervisors the
significant Pearson r’s were for variables:
(1) emotional stability, (2) freedom from
nervousness, (3) cooperativeness. Previous
studies have indicated that the trait of cooperativeness
as measured by the Personnel Inventory
has most consistently shown a relationship
with vocational success.

Table 2

Pearson r Correlations Between Scores on Fourteen
Test Variables and the Criterion of Job
Performance Ratings

Group    Variable    Pearson r*
E        S           -.16
S        S           -.03
E        T           -.01
S        T           +.03
E        D           -.09
S        D           -.13
E        C           -.04
S        C           -.19
E        R            .12
S        R           -.13
E        G           -.04
S        G           -.02
E        A            .07
S        A            .03
E        M            .18
S        M            .11
E        I            .15
S        I            [illegible]
E        N            .02
S        N            .18
E        O           -.02
S        O            .16
E        Ag           .06
S        Ag           .10
E        Co           .15
S        Co           .18

Note: Correlations printed in bold-face type are significant
at the 5 per cent level.
For variables S, T, D and C a negative correlation
indicates a positive relationship between the criterion
and the traits of sociability, thinking [illegible],
freedom from depression and emotional stability.

The interpretation of the validities obtained 
is that the more successful the executive, the 
more sociable, self-confident, cooperative and 
masculine he will tend to be (to an extent 
defined by the size of the correlation in each 
case). Greater degrees of emotional stability, 
composure, and cooperativeness will be ex- 
pected of the more successful supervisor. 

Because in some cases the regressions 
suggested some curvilinearity, it was con- 
sidered advisable to test for this effect. The 
resultant chi squares indicated that in only one 
case was there significant departure from 
rectilinearity and this was in the case of factor 
T (thinking introversion) in the executive 
group, which probably means that too much 
or too little of this trait may be detrimental to 
success in this position. 

In order to determine the effectiveness of 
scores on the traits correlating significantly 
with the criterion in terms of their contribution 
to selection of successful executives and 
supervisors, reference was made to the Taylor- 
Russell tables (6). It was found that for a 
selection ratio of .05, improvement in selection 
of “A” rated executives with use of scores on 
traits S, M, I and Co will be 10%, 11%, 9%, 
and 9% respectively, and of “A” rated
supervisors improvement from use of scores on
traits C, N and Co will be 18%, 17%, and 17%,
respectively. Assuming normality of distribu-
tion, Taylor-Russell methods are applicable
because each of these variables shows a
rectilinear relationship with the criterion. A
selection ratio of .05 was considered most
appropriate since only a very small proportion
of the industrial population is chosen for
these higher-level jobs.

Table 3

Point Biserial Correlations Between Scores on Eleven
Test Variables and the Executive-Supervisor
Dichotomy

Variable   Point Biserial r
S          -.43
D          -.51
C          [illegible]
R          [illegible]
G          .23
A          .36
I          .30
N          .18
O          .20
Ag         .12
Co         .34

Note: Factors S, D and C correlate negatively with
the criterion because lower mean scores on these traits
indicate greater amounts of sociability, freedom from
depression and emotional stability, respectively.

The point-biserial correlations are higher 
than are the Pearson r’s for the separate
groups. They are, however, lower than they 
might be had the division of the dichotomy 
been more even than the necessary 208-143 
split. The variables used are those in which 
significant differences were found between two 
group means since it is only here that a 
genuine correlation can be assumed to exist. 
It might be possible to establish cutting scores 
for those variables which show point-biserials 
larger than .20. On the basis of the cutting 
scores an applicant might be predicted to fall 
within one or the other of the two groups. 
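
The point-biserial coefficient discussed above can be sketched as follows. This is a minimal illustration, not the author's computation: the scores and group codes are invented, the function name is hypothetical, and the N divisor for the standard deviation is an assumption. The last lines compare the sqrt(pq) term of the formula for the 208-143 split with its value for an even split.

```python
import math

def point_biserial(scores, groups):
    """Point-biserial r between a continuous score and a 0/1 group code."""
    n = len(scores)
    ones = [s for s, g in zip(scores, groups) if g == 1]
    zeros = [s for s, g in zip(scores, groups) if g == 0]
    p = len(ones) / n          # proportion coded 1
    q = 1.0 - p                # proportion coded 0
    mean_all = sum(scores) / n
    # Standard deviation of all scores, using the N divisor (an assumption).
    sd = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    mean1 = sum(ones) / len(ones)
    mean0 = sum(zeros) / len(zeros)
    return (mean1 - mean0) / sd * math.sqrt(p * q)

# Hypothetical illustration only; these are not the study's data.
scores = [12, 15, 11, 18, 20, 22, 19, 25]
groups = [0, 0, 0, 0, 1, 1, 1, 1]
r_pb = point_biserial(scores, groups)

# The sqrt(p*q) term for the study's 208-143 split versus an even split.
split_factor = math.sqrt((208 / 351) * (143 / 351))
even_factor = math.sqrt(0.5 * 0.5)
```

The sqrt(pq) term is largest when the two groups are of equal size, which is the sense in which an uneven division limits the coefficient.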

Certain limiting factors have exerted their 
influence upon the size of the obtained validity 
coefficients. The most significant of these 
limitations is selectiveness of the groups. 
This selection would restrict the variance on 
these trait scores and consequently would 
lower the validity coefficients from those 
which would have been obtained from a more 
heterogeneous population of job applicants. 
Another limiting factor is attenuation due to 
the unreliability of the criterion. If correction 
for attenuation were applied for errors in the 
criterion measure, the validity coefficients 
would probably be about 20% higher. Also, 
since there is no way at this time of determin- 
ing the extent to which responses were falsified, 
this falsification may have served to lower the 
coefficients unless this source of variance is 
also correlated with the criterion.
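
The correction for attenuation in the criterion alone divides the obtained validity by the square root of the criterion reliability. The sketch below assumes that standard formula; the function names are hypothetical, and the criterion reliability is not reported in the text, so the value used is only the one implied by an increase of about 20%.

```python
import math

def correct_for_criterion_unreliability(r_xy, r_yy):
    """Validity corrected for unreliability of the criterion only."""
    return r_xy / math.sqrt(r_yy)

def implied_criterion_reliability(increase):
    """Criterion reliability implied by a given proportional increase."""
    return 1.0 / (1.0 + increase) ** 2

# A 20% increase implies a criterion reliability of roughly .69.
implied = implied_criterion_reliability(0.20)

# Applied to an obtained coefficient of .18, used here only as an example.
corrected = correct_for_criterion_unreliability(0.18, implied)
```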

What is needed is a set of norms for these 
tests from a large industrial population con- 
sisting not only of executives and supervisors, 
but covering the entire range of personnel.
In order for these norms to be of much practical 
value, however, some means of overcoming 
possibilities of falsification must be devised. 
As Ruch (5) has suggested, each personnel 
department might perform a trial testing in 


Joan S. Guilford 


which the executive or supervisor is asked 
first to respond to the inventory honestly and 
then again as he thinks the best executive or 
supervisor would respond. He must, of course, 
be assured that his score will in no way affect 
his status. In this way questions could be 
weighted according to the incidence of falsifica- 
tion on the second trial and thus a validating 
key could be devised. This may seem an 
unduly involved and expensive procedure but 
in view of the absence of selective devices for 
these all-important positions, it might well 
be worth the time and expense entailed. 


Summary 


On the basis of the investigation of 208 
executives and 143 supervisors in a large 
chain grocery with regard to temperament 
traits as revealed by scores on the Guilford 
personality inventories, it was found that: 


1. The executive averaged significantly 
more: (1) sociable, (2) free from depression, 
(3) emotionally stable, (4) happy-go-lucky, 
(5) active, (6) ascendant or socially bold, 
(7) self-confident, (8) calm and composed, 
(9) objective, (10) agreeable, and (11) coopera- 
tive than did the supervisor. 

2. On the basis of a criterion of success 
defined as “job performance” ratings, the 
following traits contributed significantly to 
success of the executive: (1) sociability, (2) 
lack of inferiority feelings, (3) cooperativeness, 
and (4) masculinity. The traits contributing 
significantly to the success of the supervisor 
were: (1) emotional stability, (2) calmness and 
composure, and (3) cooperativeness.

3. Improvements in prediction of executive 
and supervisor success by means of the Taylor- 
Russell technique were, for the “A” rated ex- 
ecutives for traits S, M, I and Co, 10%, 11%, 
9%, and 9%, respectively, and for the “A” 
rated supervisors for traits C, N, and Co, 18%, 
17% and 17%, respectively. 

4. The obtained validity coefficients are 
lower than they might be because of the 
factors of restricted range of scores due to 
selectiveness of the groups, possibility of 
falsification of responses on the tests, and 
attenuation in the criterion. 

5. Point-biserial correlations of test scores
and the executive-supervisor dichotomy as
the criterion, reveal higher validity for [illegible]
of the trait scores and indicate the possibility
of prediction into one or the other category
[illegible] a cutting score.

6. It is suggested that these results, while
they agree with previous findings, be regarded
as highly tentative until a great deal of [illegible]
investigation confirms the stability of the
obtained validities. It is also proposed that
some attempt be made to determine the extent
of falsification by weighting items according
to the extent to which executives and super-
visors are able to alter [illegible]
experimental situation. There is [illegible]
for norms for these temperament trait scores
based on an industrial population since it is
generally accepted that success in executive
and supervisory positions is largely dependent
upon personality characteristics.

Received July 23, 1951.


References

1. Guilford, J. S. The relative value of fourteen test
variables for predicting success in executive and
supervisory positions. Unpublished Master’s
Thesis, University of Southern California Li-
brary, 1951.

2. Guilford, J. P. An inventory of factors STDCR.
Manual of directions and norms (revised). Bev-
erly Hills: Sheridan Supply Company.

3. Guilford, J. P., and Martin, H. G. Inventory of
factors GAMIN. Manual of directions and norms
(first revision). Beverly Hills: Sheridan Supply
Company.

4. Guilford, J. P., and Martin, H. G. The Guilford-
Martin Personnel Inventory. Manual of direc-
tions and norms. Beverly Hills: Sheridan Supply
Company.

5. Ruch, F. L. A technique for detecting attempts to fake
performance on the self-inventory type of person-
ality test. Studies in personality. New York:
McGraw-Hill Book Co., 1942.

6. Taylor, H. C., and Russell, J. T. The relationship
of validity coefficients to the practical effective-
ness of tests in selection. J. appl. Psychol.,
1939, 23, 565-578.


Psychological Test Performance of Steel Industry 
Production Supervisors 


Wesley A. Poe and Irwin A. Berg 


Northwestern University 


In addition to specific knowledge of the jobs 
they direct, supervisors in large industrial 
plants typically are expected to demonstrate 
ability in a number of other areas. Many of 
them, for example, serve as buffers for manage- 
ment when initial questions of collective 
bargaining agreements are raised. They are 
expected to apply intelligently information 
supplied by technical specialists, and they 
give instructions to the employees whom they 
supervise. Further, they must be intimately 
familiar with the flood of management control 
devices in the form of blueprints, charts and 
mimeographed directives. Such broad job
demands would require that considerable 
care be exercised in the selection and training 
of supervisors. 

As Lawshe (8) has noted, great emphasis 
has been placed upon the training of supervisors 
but relatively little has been done about their 
selection. Attempts to utilize psychological 
tests for selecting supervisors have met with 
varying success. Shuman (12) found that 
supervisors rated “excellent” in several plants 
could be identified by a battery of three tests. 
Harrell (7) found that the Otis Test of Mental 
Ability had low but useful discriminatory 
value for cotton mill supervisors who were 
rated as “satisfactory” by their superiors. 
In a study of 44 industries in each of which the
two “best” and two “poorest” supervisors 
were given the Adaptability Test, Lawshe (8) 
found higher percentages of supervisors appear- 
ing in the “best” category as successively 
higher cut-off scores were used. 

The present study seeks to assess the value of 
certain psychological tests for the purpose of 
identifying “good” production supervisors in 
a large midwestern steel-manufacturing plant. 
Present selection methods in this firm include 
tryouts on the job, interviews with manage- 
ment officials, and informal comments by 
foremen.


Procedure 


There were 40 production supervisors at the 
department head or assistant department head 
level in the steel plant at the time of the present 
investigation. Seven of the 40 were unwilling 
or unable to participate in the study. Thus 
our data are derived from the performance of 
the 33 men who completed all tests in the 
battery described below. These men were 
the line supervisors, the “production bosses” 
who were directly in charge of fabricating the 
steel products sold by their company. All of
them supervised the men who actually made the 
product, e.g., workers who ran punch presses, 
shapers, grinders, lathes, forge hammers, and 
the like. The types and scope of plant 
operations under the jurisdiction of the 33 
supervisors were roughly the same, and their 
duties were reasonably comparable. Further, 
all of these men were full-time supervisors. 
Particular emphasis is placed upon the 
comparability of their responsibilities because 
some studies have ignored this important 
aspect. If research findings are to have value 
for other companies engaged in similar work, 
production foremen should not be grouped in a 
statistical potpourri along with supervisors of 
inspection, die-sinking, maintenance, and other 
departments. The present study seeks to 
avoid this pitfall.

A battery of ten tests was used in the study. 
The test administration was preceded by a 
portion of the Otis S. A. Test which was used 
for “warm-up” purposes. The tests used
were: Survey of Space Relations (4); California 
Short-Form Test of Mental Maturity (14); 
Adaptability Test (15); Test of Mechanical 
Comprehension Form BB (2); Survey of 
Mechanical Insight (11); The Bernreuter Per- 
sonality Inventory (3); Guilford-Martin Person- 
nel Inventory I (5); Inventory of Factors 
GAMIN (6); Strong’s Vocational Interest Blank 
(13); and Allport-Vernon’s A Study of Values
(1). The entire battery required 7 hours to
administer and was given in one day. 

Each of the 33 supervisors was rated on his 
success as a supervisor by the plant superin- 
tendent and, independently, by the two 
assistant superintendents. Because the paired- 
comparison technique of rating (9) has purport- 
edly considerably higher reliability than most 
rating methods, this approach was used in the 
form of the Personnel Comparison System (10). 
This procedure employs a booklet composed of 
slips of paper with two names written on each 
slip. The rater simply checks the preferred 
name on each slip. The number of times 
each individual is preferred is then tallied. 
Although there was some disagreement about 
supervisors in the middle positions, the raters 
were unanimous in their agreement on the top 
10 and bottom 10 supervisors. These super- 
visors, representing those rated at the top 30 
per cent and the bottom 30 per cent, were 
designated as the “high” and “low” criterion 
groups. After the criterion groups were 
identified, the test performance of the two
groups was compared.
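
The tallying step of a paired-comparison rating can be sketched as below. This is an illustration under stated assumptions, not the Personnel Comparison System itself: the names, the made-up merit values, and the `prefer` rule standing in for a rater's judgment are all hypothetical.

```python
from itertools import combinations
from collections import Counter

def tally_preferences(names, prefer):
    """Count how often each name is preferred across all pairings."""
    counts = Counter({name: 0 for name in names})
    for a, b in combinations(names, 2):
        counts[prefer(a, b)] += 1   # one slip per pair, one check per slip
    return counts

# Hypothetical rater: prefers whoever has the higher made-up merit value.
merit = {"A": 3, "B": 1, "C": 4, "D": 2}
counts = tally_preferences(
    list(merit), lambda a, b: a if merit[a] > merit[b] else b
)

# Number of slips one rater must check when 33 supervisors are paired.
slips_for_33 = 33 * 32 // 2
```

With 33 names every rater judges each of the 528 possible pairs once, so a single rater can prefer any one supervisor at most 32 times.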

The t test was used to establish the signif-
icance of test score mean differences between
the “high” and “low” rated groups of super-
visors. F ratios were also computed as a
check on the randomness and normality of 
the distributions. As scrutiny of the tables 
will disclose, the assumption of homogeneity 
of variance has been met in all but two cases. 
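
The t and F ratios can be recomputed from the tabled means and sigmas. The sketch below assumes that the tabled sigmas were computed with the N divisor, so that the standard error of the difference for two groups of equal size n is sqrt((s1² + s2²)/(n − 1)); the source does not state which formula was used, and the function name is hypothetical. Under that assumption the Total Mental Factors row of Table 2 (means 101.1 and 63.0, sigmas 9.7 and 19.5, 10 men per group) gives values close to the tabled 5.24 and 4.04.

```python
import math

def t_and_f(mean_hi, sd_hi, mean_lo, sd_lo, n):
    """t ratio and F ratio for two groups of equal size n.

    Assumes the sigmas use the N divisor, so the standard error of the
    difference is sqrt((sd_hi**2 + sd_lo**2) / (n - 1)).
    """
    se_diff = math.sqrt((sd_hi ** 2 + sd_lo ** 2) / (n - 1))
    t = (mean_hi - mean_lo) / se_diff
    # F is the larger variance divided by the smaller.
    f = max(sd_hi, sd_lo) ** 2 / min(sd_hi, sd_lo) ** 2
    return t, f

# Total Mental Factors row of Table 2.
t, f = t_and_f(101.1, 9.7, 63.0, 19.5, 10)
```

With 10 men in each group the t ratio has 18 degrees of freedom, for which the 5 and 1 per cent values are 2.10 and 2.88, the figures cited in the footnotes to Table 2.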


Results 


It is difficult to evaluate the effect on test
performance of the mean age and education
differences between the “high” and “low”
groups of supervisors as shown in Table 1. 
While normal deterioration with age may be 
reflected in some of the test score differences, 
it seems unlikely that the original ratings 
should have been similarly influenced by the 
ages and educational levels of the men rated. 
Such an assumption would require that all 
three of the raters would have adopted similar 
stereotypes and worn identical “halos” since
their agreement on which supervisors belonged 
in the top and bottom groups was perfect. 

Table 1
Characteristics of the Subjects of the Criterion Groups

Characteristic           Group “High”   Group “Low”
N                        10             10
Age:
  Mean                   45.2           56.1
  S.D.                   4.8            [illegible]
  Range                  39-53          46-63
Education (in years):
  Mean                   15.6           10.5
  S.D.                   1.2            2.5
  Range                  12-16          8-16
Rating score range       57-77          23-43

All of the supervisors in the present study
were doing a satisfactory job and being paid
6 to 10 thousand dollars a year for doing it.
The “high” and “low” groups represented men 
rated as performing in a clearly superior 
fashion, on the one hand, and men doing an 
ordinary but acceptable job on the other. 
Had any of them revealed obvious evidence 
of deterioration they would not have long 
remained in their positions. Further, since 
these men were carefully selected from a large 
group of employees of demonstrated capacity, 
it seems probable that any deficiencies among 
the supervisors studied were not readily 
apparent to the raters or, at least, not con- 
sidered to be important. Thus it is believed 
reasonable to assume that the differences 
between the groups as shown in Table 1 did 
not markedly influence the ratings. 

Whatever the possible influence of age and 
formal education, it seems clear that the 
mental ability tests used can discriminate 
usefully between the groups of supervisors 
studied. The differences between the total- 
score means on both mental ability tests 
(Table 2) were highly significant statistically. 
One part-score, logical reasoning, did not meet 
the assumption of homogeneity of variance. 
Another part-score, spatial relations, failed 
to distinguish acceptably between the two 
groups. By contrast, The Survey of Space 
Relations (Table 3) which presumably measures 
a similar ability in a different way did dis- 
criminate significantly between the “high” and 
“low” groups. The latter test, being longer 
and specifically validated for industrial groups, 
is probably the more appropriate instrument 
for the present investigation. Similarly, the
Mechanical Comprehension Test (Table 3) also
distinguished significantly between the two 
groups of supervisors. These findings appear 
reasonable in that higher levels of mechanical 
ability among steel industry supervisors should 
be expected to contribute to a smoother 
running department from a production stand- 
point, at least where machine operations are 
concerned. Presumably, such men would tend 
to be rated higher on job performance. It is 
also probable that such men would be promoted 
sooner; hence the lower mean age of the “high” 
group. 

Table 2
Performance on the Mental Ability Tests

Test                          Group “High”   Group “Low”   t Ratio   F
California Short-Form Test
of Mental Maturity
  Total Mental Factors
    Mean                      101.1          63.0          5.24**    4.04
    Sigma                     9.7            19.5
  Language Factors
    Mean                      59.8           31.9          5.60**    3.22
    Sigma                     7.3            [illegible]
  Non-Language Factors
    Mean                      41.3           29.6          4.30**    1.07
    Sigma                     5.9            [illegible]
  Spatial Relationships
    Mean                      26.3           20.6          1.64      1.49
    Sigma                     5.0            4.1
  Logical Reasoning
    Mean                      21.7           14.1          2.25*     39.89†
    Sigma                     1.6            10.1
  Numerical Reasoning
    Mean                      18.9           9.4           4.86**    1.05
    Sigma                     3.8            3.7
  Vocabulary
    Mean                      34.2           17.4          4.26**    2.10
    Sigma                     6.7            9.7
Adaptability Test
    Mean                      26.7           18.1          3.75**    4.41
    Sigma                     3.0            6.3

* All differences so marked are significant at the 5
per cent level of confidence; t = 2.10.
** All differences so marked are significant at the 1
per cent level of confidence; t = 2.88.
† All differences so marked fail to meet the assump-
tion of homogeneity of variance. [illegible] value
of 4.5 for significance beyond 5 per cent level.


Table 3
Performance on the Mechanical Ability Tests

Test                          Group “High”   Group “Low”   t Ratio   F
Mechanical Comprehension
    Mean                      48.4           36.9          2.23*     2.43
    S.D.                      8.4            13.1
Survey of Mechanical Insight
    Mean                      31.0           16.3          4.34**    20.47†
    S.D.                      2.1            9.5
Survey of Space Relations
    Mean                      78.2           [illegible]   3.54**    1.58
    S.D.                      14.3           18.0

* Significant at 5 per cent level.
** Significant at 1 per cent level.


Any competent observer of supervisors at
work would agree that personality factors
play an important part in supervisory success.
The personality and interest tests used in the 
present study were included with the hope that 
some useful differences between the “high” 
and “low” groups would be found. The 
results, presented in Tables 4, 5, 6, 7, and 8, 
are quite disappointing in this respect.¹ The
Bernreuter scales B2-S (Self-Sufficiency) and
F2-S (Sociability) differentiated the groups at
the 5 per cent level of confidence. Even
these minimally significant differences are
weakened since the two scales correlate .60
according to the manual (3). The Strong
Vocational Interest Blank scores for Aviator
and the masculinity-femininity scale also dis-
tinguished between the two supervisory groups
at the 5 per cent level of significance. How- 
ever, since 42 scales were scored on the 
Vocational Interest Blank, we should expect to 
obtain two scales differentiating at the 5 per 
cent level by chance alone. At any rate, 
further study is clearly indicated before the 
value of the Strong can be assessed for identify- 
ing successful supervisors in the steel industry. 
The Personnel Inventory I, Inventory of Factors
GAMIN, and the Study of Values failed to
yield useful differences between the “high”
and “low” groups of supervisors.

1 To save space and to reduce printing costs, Tables
4, 5, 6, 7, and 8 have been deposited with the American
Documentation Institute. Order Document 3630 from
American Documentation Institute, 1719 N Street,
N.W., Washington, D. C., remitting $1.00 for photo-
copies (6 X 8 inches) readable without optical aid or
$1.00 for microfilm (images 1 inch high on standard
35 mm. motion picture film).
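
The chance expectation mentioned for the 42 scales of the Vocational Interest Blank can be sketched as follows. The expected count is simply 42 × .05. The binomial probability is only illustrative, because it assumes the 42 scales are statistically independent, which the source does not claim and which is unlikely for scales scored from the same items; the function name is hypothetical.

```python
from math import comb

def prob_at_least(k, n, p):
    """P(X >= k) for X distributed Binomial(n, p)."""
    return 1.0 - sum(
        comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k)
    )

n_scales, alpha = 42, 0.05

# Expected number of scales reaching the 5 per cent level by chance alone.
expected_by_chance = n_scales * alpha

# Probability of two or more such scales, ASSUMING independent scales.
p_two_or_more = prob_at_least(2, n_scales, alpha)
```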

Since both groups of supervisors were 
doing at least a satisfactory job, it is possible 
that no important, measurable differences in 
personality or interest distinguished one group 
from the other, despite the fact that the 
“high” group was rated as doing a much better 
job than the “low” group. On the other hand, 
the tests employed may not have accurately 
reflected any existing differences. One has 
the feeling, if opinion may be permitted, 
that real differences of interest and personality 
do exist between such groups. Perhaps other 
instruments may demonstrate the presence of 
any such supposed differences. 


Summary 


The job success of 33 production supervisors 
in a large steel-manufacturing concern was 
independently rated by the plant superintend- 
ent and two assistant superintendents using a 
paired-comparison rating technique. The 10 
supervisors rated highest and the 10 rated 
lowest were given a 7-hour battery of 10 
tests of intelligence, mechanical ability, per- 
sonality, and interest. Differences in test 
scores between the “high” and “low” groups 
of supervisors were studied. It was found 
that the California Short Form Test of Mental 
Maturity, the Survey of Space Relations, and 
the Adaptability Test differentiated between the 
“high” and “low” supervisors at the 1 per 
cent level of confidence. The Mechanical 
Comprehension Test, two scales on the Bernreuter
Personality Inventory, and two scales on the 
Strong Vocational Interest Blank distinguished 
between the two groups of supervisors at the
5 per cent level of significance. None of the
other tests yielded significant differences. 


Received September 17, 1951. 


References 


1. Allport, G. W., and Vernon, P. E. Manual for the 
study of values. Cambridge: Riverside Press, 
1931.

2. Bennett, G. K. Manual for the test of mechanical 
comprehension. New York: Psychological Cor- 
poration. 

3. Bernreuter, R. G. Manual for the personality in- 
ventory. Stanford University: Stanford Univer- 
sity Press, 1935. 

4. Case, H. W., and Ruch, F. Manual for the survey 
of space relations. Los Angeles: California Test 
Bureau. 

5. Guilford, J. P., and Martin, H. G. Manual for the
personnel inventory I. Los Angeles: Sheridan
Supply Company.

6. Guilford, J. P., and Martin, H. G. Manual for the 
inventory of factors GAMIN. Los Angeles: 
Sheridan Supply Company. 

7. Harrell, T. W. Testing cotton mill supervisors. 
J. appl. Psychol., 1940, 24, 31-35. 

8. Lawshe, C. H. How can we pick better super- 
visors? Personnel Psychol., 1949, 2, 69-74. 

9. Lawshe, C. H., Kephart, N. C., and McCormick, 
E. J. The paired comparison technique for 
rating performance of industrial employees. J. 
appl. Psychol., 1949, 33, 69-77. 

10. Likert, R. Morale and agency management. Hart- 
ford: Life Insurance Sales Bureau, 1941. 

11. Miller, D. R. Manual for the survey of mechanical 
insight. Los Angeles: California Test Bureau. 

12. Shuman, J. T. The value of aptitude tests for 
supervisory workers in the aircraft engine and 
propeller industries. J. appl. Psychol., 1945, 29, 
185-190. 

13. Strong, E. K. Manual for the vocational interest 
blank. Stanford University: Stanford Univer- 
sity Press. 

14. Sullivan, E. T., Clark, W. W., and Tiegs, W. W. 
Manual for the California short-form test of mental 
maturity, advanced form. Los Angeles: California 
Test Bureau, 1947. 

15. Tiffin, J., and Lawshe, C. H. Preliminary manual
for the adaptability test. Chicago: Science Re-
search Associates, 1943.


A Cluster Analysis of Office Operations *


Leon L. Thomas 


Dunlap and Associates, Inc., Stamford, Connecticut 


The identification of components of work is 
an integral step for the investigator in the 
area of Industrial Psychology. One phase in 
the construction of tests for selection and 
placement purposes is the determination of 
groups of similar operations. The develop- 
ment of a training program depends, in part, 
upon the isolation of components of work that 
tend to cut across job lines. Likewise, work 
components contributing to the make-up of a 
job family can serve as a foundation upon 
which a system for rating job incumbents can 
be built. Investigators in these areas are 
constantly seeking methods which will give 
them more information about the fundamental 
nature of the job families under consideration. 

Research by Lawshe and others (4, 5) 
suggests that factorial methods can be success- 
fully applied to job analysis data. They have 
shown that factorial methods can be used to 
identify components when applied to job 
evaluation systems. Tryon (6) has developed 
a technique intended to accomplish results 
similar to those of the factorial methods in a
somewhat less mathematically rigorous man- 
ner. The Tryon technique seemed promising 
and a modification was developed for use with 
this particular problem. 

This study, therefore, represents a new 
methodological attack upon the identification 
of components of work. Specifically, the 
purpose is to: (1) identify groups, or clusters, 
of similar elemental operations in a sample of 
office jobs; and (2) develop a modification of 
the Tryon Cluster Analysis technique for 
use with a large number of variables. 


Procedure 


The Check-List. Largely through the work
of Culbertson (1) and Dudek (2), the Job
Description Check-List of Office Operations has
been constructed for use in the analysis of office
jobs.¹ It consists of a series of 139 basic cleri-
cal operations. The check-list, for purposes of
this study, was completed by having each job
incumbent and his or her immediate supervisor
check, independently, the duties performed on
the job. A third party, usually the coordi-
nator of research within the company, com-
pared the two and identified any points of
difference. A conference was then held with
the incumbent and the supervisor in order to
reach agreement.

* This study is based upon a thesis submitted in
partial fulfillment for the degree of Doctor of Philoso-
phy, done under the direction of Professor C. H. Lawshe.
The original data for this study can be found in the
author’s thesis on file in the Purdue University library.

The Sample. The sample of office jobs used 
in this study was obtained from five different 
companies. One was a foundry located in the 
South, another was an office filing equipment 
manufacturer in the East. A third contributor 
was a member of the automotive industry in 
the Midwest and two were steel mills, one 
located in the Midwest, the other in the East. 

The Selection Criteria. Each of the five com- 
panies was asked to return completed check- 
lists for 25 key jobs. The key jobs were to 
be distributed throughout the entire range at 
present pay rates; were to “sample” the various
areas of work being performed; and were not 
to be in dispute with regard to pay rates. 
From the 115 check-lists so obtained (the 
foundry was able to supply only 15), 112 were 
used in this study. Three were discarded be- 
cause of insufficient data. 

Limitations of the Sample. It is to be re- 
membered that the results of this study are 
based upon this particular sample of office jobs. 
These companies are engaged in relatively 
different kinds of activity, and the jobs ob-
tained from them seem to sample [illegible]
areas of office operations. It is possible, how-
ever, that a similar analysis made of [illegible]
operations obtained from a different population
might produce different results. 

The Selection of Items. A frequency count 
was made of the number of times each check- 
list item was used in the sample of 112 jobs. 
The frequencies ranged from zero to 98, and 
had a median of 32. Seventy-nine items were 
found to have frequencies of twenty or more. 
These 79 items were the ones used in this
analysis.

1 To reduce printing costs the Job Description Check-
List of Office Operations has been deposited with the
American Documentation Institute. Order Document
3267 from American Documentation Institute, 1719 N
Street, N.W., Washington, D. C., remitting $1.00 for
microfilm (images 1 inch high on standard 35 mm.
motion picture film) or $1.20 for photocopies (6 X 8
inches) readable without optical aid.


The Correlation Matrix. A matrix of inter-
correlation for the 79 selected items was com-
puted. The standard formula, with certain
adaptations, for the computation of φ was used²
(3). The coefficients of correlation ranged from
-.25 to +.81.

Construction of Tryon Curves. A curve was 
drawn according to the Tryon Method for each 
of the 79 selected items (6). The units along 
the ordinate were numbered from -.25 to
+.81. The units along the abscissa were num- 
bered from 1 through 79, the number of items 
in the analysis. The intercorrelations for a 
particular item were plotted by locating a point
above each of the items along the abscissa at 
a distance corresponding to the size of the 
correlation between the item being plotted and 
each of the other 78. For the correlation of 
each item with itself, the value of the highest 
intercorrelation of that item with any of the 
others was used. The points on each of these
plots were joined producing a curve indi-
cating the manner in which a particular item
correlated with the other 78.

2 $$\phi = \frac{ad - bc}{\sqrt{(a+b)(c+d)(a+c)(b+d)}}$$
where a = the number of jobs on which both item i and
item j were checked; b = the number of jobs on which
item i was checked and item j was not checked; c = the
number of jobs on which item j was checked and item i
was not checked; and d = the number of jobs on which
neither item i nor item j was checked.
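
The φ coefficient for a pair of check-list items can be sketched from the fourfold counts a, b, c and d defined in the footnote. This follows the standard formula only; the “certain adaptations” mentioned in the text are not described and are not reproduced. The function names, the example data and the handling of a zero denominator are assumptions made for illustration.

```python
import math

def fourfold_counts(item_i, item_j):
    """Return (a, b, c, d) from two 0/1 lists, one entry per job."""
    a = sum(1 for x, y in zip(item_i, item_j) if x == 1 and y == 1)
    b = sum(1 for x, y in zip(item_i, item_j) if x == 1 and y == 0)
    c = sum(1 for x, y in zip(item_i, item_j) if x == 0 and y == 1)
    d = sum(1 for x, y in zip(item_i, item_j) if x == 0 and y == 0)
    return a, b, c, d

def phi_coefficient(a, b, c, d):
    """Standard phi coefficient from fourfold counts."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        return 0.0  # convention assumed here for an item checked on all or no jobs
    return (a * d - b * c) / denom

# Hypothetical check-list data for eight jobs (1 = item checked).
item_i = [1, 1, 1, 0, 0, 0, 1, 0]
item_j = [1, 1, 0, 0, 0, 1, 1, 0]
counts = fourfold_counts(item_i, item_j)
phi = phi_coefficient(*counts)
```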




In order for distances in the region where
φ approached + or -1 to be equal to those
in the region where φ approached zero, the
values along the ordinate were transformed to
Fisher's z′. Also, each curve was drawn on a
separate 40” X 24” sheet of tracing paper so
that the curves could be superimposed for sub-
sequent analysis.
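
Fisher's transformation is the inverse hyperbolic tangent of the coefficient. The short sketch below, with a hypothetical function name and example values chosen only for illustration, shows why it stretches the ordinate near the extremes: the same step of .10 in the coefficient covers more distance near .81 than near zero.

```python
import math

def fisher_z(r):
    """Fisher's transformation: z = 0.5 * ln((1 + r) / (1 - r))."""
    return math.atanh(r)

# Equal steps of .10 in the coefficient at two places on the scale.
low_end = fisher_z(0.10) - fisher_z(0.00)
high_end = fisher_z(0.81) - fisher_z(0.71)
```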

Curve Inspection. In a Tryon Analysis, 
groups or clusters of items are identified by 
isolating groups of congruous curves. In order 
to accomplish this with 79 separate curves, it 
was necessary to establish a starting point, or 
preliminary sets of items. First, a table was 
drawn up listing each of the selected items 
and the three items most highly correlated 
with them. From this table it was possible to 
note groups of from three to five items which 
exhibited a high degree of intercorrelation. 
Sixteen tentative clusters were identified, ac- 
counting for 40 of the 79 curves. 
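
The starting table described above, listing for each item the items most highly correlated with it, can be sketched as follows. The five-item matrix is invented for illustration, the function name is hypothetical, and k = 2 is used only to keep the example small; the study used the three highest correlates among 79 items.

```python
def top_correlates(corr, k=3):
    """For each item, list the k other items with the highest correlations."""
    table = {}
    for i, row in enumerate(corr):
        others = [(r, j) for j, r in enumerate(row) if j != i]
        others.sort(key=lambda pair: pair[0], reverse=True)
        table[i] = [j for _, j in others[:k]]
    return table

# Hypothetical intercorrelations among five items.
corr = [
    [1.00, 0.70, 0.65, 0.05, 0.10],
    [0.70, 1.00, 0.60, 0.00, 0.15],
    [0.65, 0.60, 1.00, 0.10, 0.05],
    [0.05, 0.00, 0.10, 1.00, 0.55],
    [0.10, 0.15, 0.05, 0.55, 1.00],
]
table = top_correlates(corr, k=2)
```

In this made-up matrix items 0, 1 and 2 name one another as their highest correlates, which is the kind of pattern that would suggest a tentative cluster.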

A viewing box was constructed, 40” X 24”
in size with a translucent plastic top lighted
by seven 60 w. lumiline bulbs. This made it
possible to superimpose the curves for in-
spection.

First, the curves for the preliminary sets of
items were inspected for congruity. On the
basis of inspection some of the sets were com-
bined, reducing the number of sets to seven,
accounting for 40 of the items. Then each of
the remaining curves was compared with each
of the seven sets and 20 of them added to one
or the other of the sets. Finally, the nineteen
remaining curves were compared with one
another and an eighth set emerged consisting
of three curves. These eight sets of curves,
identified by inspection, accounted for 63 of
the 79 selected items. Sixteen of the curves
exhibited no apparent congruity with any of
the eight identified sets or with one another.


Table 1

Distribution of Cluster Item and Residual Item Correlations After the First Refinement

[Table body not recoverable from the source.]

C = Cluster items.
R = Residual items.


Table 2

Distribution of Cluster Item and Residual Item Correlations After the Second Refinement

[Table body not recoverable from the source.]

C = Cluster items.
R = Residual items.


First Refinement. The correlation between 
each of the 63 items, which had been placed in 
one of eight clusters by inspection, and the 
cluster in which it appeared was computed. 
Similarly, the correlation between each of the 
16 residual items and each of the eight clusters 
was computed. Table 1 shows the distribu- 
tion of correlations between the 63 cluster items 
and the cluster in which they appear and the 
distribution of the correlations between the 16 
residual items and each of the eight clusters. 
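The item-to-cluster correlation can be computed directly from the item intercorrelation matrix. The sketch below assumes the usual formula for the correlation of a standardized item with an unweighted sum of standardized items, and it assumes that an item already in the cluster contributes its own unit correlation to the numerator. The function name and the matrix values are hypothetical.

```python
import math


def item_cluster_r(corr, item, cluster):
    """Correlation of one item with the unweighted sum of a cluster's items."""
    # Numerator: sum of the item's correlations with every cluster member
    # (a member correlates 1.0 with itself, read from the diagonal).
    numerator = sum(corr[item][j] for j in cluster)
    # Denominator: standard deviation of the sum of n standardized items.
    pair_sum = sum(
        corr[j][k] for a, j in enumerate(cluster) for k in cluster[a + 1:]
    )
    return numerator / math.sqrt(len(cluster) + 2 * pair_sum)


# Hypothetical 5-item correlation matrix (illustrative values only).
R = [
    [1.00, 0.80, 0.75, 0.10, 0.05],
    [0.80, 1.00, 0.70, 0.15, 0.00],
    [0.75, 0.70, 1.00, 0.05, 0.10],
    [0.10, 0.15, 0.05, 1.00, 0.60],
    [0.05, 0.00, 0.10, 0.60, 1.00],
]

r_member = item_cluster_r(R, 0, [0, 1, 2])    # item inside the cluster
r_residual = item_cluster_r(R, 3, [0, 1, 2])  # item outside the cluster
print(round(r_member, 3), round(r_residual, 3))
```

A member item is favored by the part-whole overlap in this form, which is worth keeping in mind when member and residual correlations are compared.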

Second Refinement. It can be seen from 
Table 1 that seven of the residual items corre- 


3 The author is indebted to Mr. P. C. Baker of the
Division of Educational Reference, Purdue University,
for the development of this statistic.

$$r_{iI} = \frac{\sum r_{iI_j}}{\sqrt{n_I + 2\sum r_{I_jI_k}}}$$

where $r_{iI}$ = the correlation of item $i$ with
cluster I;
$\sum r_{iI_j}$ = the sum of the correlations of item $i$ with the
items composing cluster I; $\sum r_{I_jI_k}$ = the sum of the inter-
correlations of all the items composing cluster I;
$n_I$ = the number of items composing cluster I.


late as highly or more highly with one or the 
other of the eight clusters than some of the 
items appearing in them originally. It was
decided to include these seven items in the
clusters. One item was added to cluster III,
three items to cluster IV, one item to cluster
[illegible], one item to cluster VII, and one item
to cluster [illegible].
After these additions, again the correlations
between each of the 70 cluster items and the
cluster in which it appeared were computed.
Likewise, the correlations between each of the
nine residual items and each of the eight clus-
ters were computed. Table 2 shows the dis-
tribution of these correlations after the second
refinement.

It can be seen from Table 2 that none of the 
correlations between the residual items and the 


Table 3

Matrix of Cluster Intercorrelations

Cluster     I       II      III     IV      V       VI      VII
II        [illegible]
III        −.042   .589
IV         −.050   .497    .607
V           .263   .428    .426    .232
VI          .186   .135    .158    .283    .173
VII         .131   .392    .398    .185    .363    .384
VIII        .097   .641    .430    .285    .387    .195    .30[?]


A Cluster Analysis of Office Operations 


eight clusters overlapped with those between 
the cluster items and the clusters in which they 
appeared. It was decided, therefore, to stop 
the addition of items after the second refine- 
ment and let the nine remaining items remain 
as residual or “unique” items. 
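One reading of the refinement rule is: a residual item joins the cluster it correlates with most highly if that correlation is at least as high as the lowest item-to-cluster correlation among the cluster's current members; refinement stops when no residual item qualifies. The sketch below implements a single pass under that reading. The names `refine` and `item_cluster_r`, the cluster label "A", and the matrix are all hypothetical.

```python
import math


def item_cluster_r(corr, item, cluster):
    """Correlation of an item with the unweighted sum of a cluster's items."""
    numerator = sum(corr[item][j] for j in cluster)
    pair_sum = sum(
        corr[j][k] for a, j in enumerate(cluster) for k in cluster[a + 1:]
    )
    return numerator / math.sqrt(len(cluster) + 2 * pair_sum)


def refine(corr, clusters, residuals):
    """One refinement pass: move qualifying residual items into clusters."""
    new_clusters = {name: list(members) for name, members in clusters.items()}
    remaining = []
    for item in residuals:
        # Find the cluster this residual item correlates with most highly.
        best_name = max(
            clusters, key=lambda name: item_cluster_r(corr, item, clusters[name])
        )
        best_r = item_cluster_r(corr, item, clusters[best_name])
        # Lowest correlation among the items already in that cluster.
        floor = min(
            item_cluster_r(corr, m, clusters[best_name])
            for m in clusters[best_name]
        )
        if best_r >= floor:
            new_clusters[best_name].append(item)
        else:
            remaining.append(item)
    return new_clusters, remaining


# Hypothetical 4-item matrix: items 0 and 1 form a weak cluster "A".
R = [
    [1.0, 0.2, 0.7, 0.1],
    [0.2, 1.0, 0.7, 0.1],
    [0.7, 0.7, 1.0, 0.0],
    [0.1, 0.1, 0.0, 1.0],
]

new_clusters, remaining = refine(R, {"A": [0, 1]}, [2, 3])
print(new_clusters, remaining)
```

After each pass the item-to-cluster correlations would be recomputed for the enlarged clusters, as the text describes for the second refinement.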

Cluster Intercorrelations. Table 3 shows the
intercorrelations of the eight identified clus-
ters.⁴ The residual items showed little tend-
ency to intercorrelate. Their intercorrelations
ranged from −.14 to +.34, with a mean of [illegible].
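The correlation between two clusters can likewise be obtained from the item intercorrelations alone. The sketch assumes the standard formula for the correlation of two unweighted sums of standardized items: the sum of the cross-correlations divided by the product of the two composite standard deviations. Function names and values are hypothetical.

```python
import math


def composite_sd(corr, cluster):
    """Standard deviation of the unweighted sum of standardized cluster items."""
    pair_sum = sum(
        corr[j][k] for a, j in enumerate(cluster) for k in cluster[a + 1:]
    )
    return math.sqrt(len(cluster) + 2 * pair_sum)


def cluster_r(corr, cluster_a, cluster_b):
    """Correlation between the sums of two non-overlapping clusters."""
    cross_sum = sum(corr[i][j] for i in cluster_a for j in cluster_b)
    return cross_sum / (composite_sd(corr, cluster_a) * composite_sd(corr, cluster_b))


# Hypothetical 5-item correlation matrix (illustrative values only).
R = [
    [1.00, 0.80, 0.75, 0.10, 0.05],
    [0.80, 1.00, 0.70, 0.15, 0.00],
    [0.75, 0.70, 1.00, 0.05, 0.10],
    [0.10, 0.15, 0.05, 1.00, 0.60],
    [0.05, 0.00, 0.10, 0.60, 1.00],
]

r_between = cluster_r(R, [0, 1, 2], [3, 4])
print(round(r_between, 3))
```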


Results 


Eight clusters, or components of work, 
constituting office operations in a sample of 
112 office jobs were identified. These clusters
of items may be characterized by the type of 
operations found within them. 

Cluster I (Typing). It seems probable that 
the operations within cluster I group them- 
selves because of the equipment used in the 
performance of the operations. It consists of 
such items as: “Typewrites straight copy from
correct forms,” “Prepares duplicating machine
‘stencils’ and/or ‘master copies’ using a type-
writer,” and “Typewrites straight copy from
rough draft.” All the items contained in
cluster I pertain to the use of the typewriter
in various ways: addressing, making out
forms, checks, etc.

Cluster II (Listing and Compilation).
Cluster II may be described as a cluster of
listing and compilation activities. The items
isolated in cluster II pertain to copying,
checking, listing, and recording activity. It
contains such items as: “Compiles lists of
numerical, verbal, or other descriptive data,”
“Prepares routine lists of specific items,
numerical and/or verbal, according to des-
ignated system by longhand,” and “Copies
desired data from one form or record into the
proper place on another form or record by
longhand.”

Cluster III (Communication). Cluster III
may be thought of as information gathering
and dissemination activity. The items con- 


4 The author is indebted to Mr. P. C. Baker of the
Division of Educational Reference, Purdue University,
for the development of this statistic.

$$r_{I,II} = \frac{\sum r_{I_jII_k}}{\sigma_I\,\sigma_{II}}$$

where $r_{I,II}$ = the correlation between cluster I and
cluster II; $\sum r_{I_jII_k}$ = the sum of the intercorrelations of
the items composing cluster I with those composing
cluster II; $\sigma_I = \sqrt{n_I + 2\sum r_{I_jI_k}}$; $\sigma_{II} = \sqrt{n_{II} + 2\sum r_{II_jII_k}}$.




tained in this cluster deal with placing and 
answering telephone calls, sending, routing 
and relaying forms, materials, and information 
to proper persons or departments and receiving 
requests, instructions or information. 

Cluster IV (Planning and Supervision). 
Planning and supervisory operations char- 
acterize cluster IV. Isolated in this cluster 
were items such as: “Plans and/or coordinates 
the assignment or execution of duties performed 
by other individuals,” “Reviews and analyzes
work of others, calling attention to use of 
incorrect procedures or methods,” and “Com- 
poses correspondence requiring specific knowl- 
edge of methods, procedures, policies, or 
other information.” 

Cluster V (Filing). Cluster V can be 
described as a cluster of filing operations. 
Operations found in this cluster pertain to the 
classification, coding, positioning, location, 
and transference of materials in files. 

Cluster VI (Stock Handling). The phrase 
“stock handling” is appropriate for describing 
cluster VI. Items pertaining to wrapping and 
packaging of articles, removing stock, merchan- 
dise, or other items from special places or 
containers, and marking items with identifica- 
tion and postal information are found in it. 

Cluster VII (Routine Clerical). Cluster 
VII is more difficult to describe. The opera- 
tions in cluster VII seem to pertain to routine, 
low-level, clerical activity. It contains items 
such as: “Counts various items,” “Assembles
various forms,” “Fastens together various
forms, records or other items,” “Cuts or tears
apart special forms,” and similar activities.

Cluster VIII (Calculation). As can be seen 
from Table 3, cluster VIII is somewhat highly 
related to cluster II. This degree of relation- 
ship can be explained, at least in part, when 
the items in cluster VIII are compared with 
those of cluster II. Whereas cluster II 
contains several items pertaining to the 
copying, compilation, and listing of numerical 
data, cluster VIII contains items dealing with 
the mathematical manipulation of numerical 
data. It seems reasonable, therefore, that 
cluster VIII be thought of as a “dependent” 
cluster with regard to cluster II, hence, highly 
associated with it. 

Residual Items. Nine of the original 79 
selected items remained as residual items 




after the second refinement. The correlations 
between these items and the eight clusters 
were not high enough for them to be included 
in one or the other of the clusters, and their 
intercorrelations were small. It is tenable, 
therefore, to assume that these residual items 
are concerned with “unique” operations. 

There are two possible explanations for the 
“uniqueness” of the residual items. First, 
it would be possible for a residual item to be 
associated with items that have been placed 
in the identified clusters, but, due to the 
ambiguity of the item, associated in so random 
a fashion that it could not be placed in any 
one cluster. Second, it would be possible for 
a residual item to be a “specific” operation 
requiring 100 per cent of an incumbent’s 
time, or at least so much of an incumbent’s 
time that other operations have little opportu-
nity to become associated with it. An examina-
tion of the residual items lends further support
for the latter explanation. The residual items
pertain to such activities as: “Gives out
various materials, forms or other items upon
request of customers, clients or other individ-
uals,” “Receives merchandise, materials, forms,
telegrams, mail, or other items which are 
being delivered, or being returned for refund, 
replacement, or other purposes,” and “Sends 
telegrams.” 

Unselected Items. The items used in this 
analysis were selected on the basis of their 
frequency of occurrence in a particular sample 
of office jobs. Only the 79 items that were 
checked at least 20 times were used, leaving 
60 items unselected. The addition of un- 
selected items could affect the analysis in 
three ways. First, additional items could 
correlate highly enough with one or the other of 
the previously identified clusters to be included 
in them. Second, additional items could show 
up as “unique” operations. Third, it would
[text illegible in source] of items in the analysis.
[None of] the operations concerned with
[stenography] or automatic machine
activity had sufficient frequencies to be
selected for the study. It is therefore likely
that had these items been used often enough
to have been included in the analysis, a ninth,
or stenographic cluster, and a tenth, or
automatic machine operation cluster would
have been identified.




Summary 


1. The purpose of this study was a) to 
identify clusters of similar elemental operations 
in a sample of office jobs, and b) to develop a 
modification of the Tryon Cluster Analysis 
technique for use with a large number of
variables. 

2. Using completed Job Description Check- 
Lists of Office Operations from 112 office jobs, 
the intercorrelations for 79 selected check-list 
items were computed. 

3. A modification of the Tryon Cluster 
Analysis technique was developed and used to 
identify eight clusters of office operations 
which may be thought of as the work com- 
ponents of a sample of office jobs. 

4. The clusters, or components of work, 
identified were: a) Typing, b) Listing and 
Compilation, c) Communication, d) Planning 
and Supervision, e) Filing, f) Stock Handling, 
g) Routine Clerical Operations, and h) Cal- 
culation. 

5. A procedural outline for performing a
cluster analysis with a large number of
variables was written.⁵


Received August 1, 1951. 


References 


1. Culbertson, A. L. The adequacy of an operational 
check-list for the general description of office jobs. 
Unpublished M.S. thesis, Purdue University, 
1947. 

2. Dudek, E. E. An operational approach to the evalua-
tion of office jobs. Unpublished Ph.D. thesis,
Purdue University, 1948.

3. Guilford, J. P. Fundamental statistics in psychology
and education. Second edition, New York:
McGraw-Hill, 1950.

4. Lawshe, C. H., and Satter, G. A. Studies in job
evaluation. 1. Factor analysis of point ratings
for hourly-paid jobs in three industrial plants.
J. appl. Psychol., 1944, 28, 189-198.

5. Lawshe, C. H., Dudek, E. E., and Wilson, R. F.
Studies in job evaluation. 7. A factor analysis
of two point rating methods of job evaluation.
J. appl. Psychol., 1948, 32, 118-129.

6. Tryon, R. C. Cluster analysis. Berkeley, Calif.:
Associated Students Store, University of Cali-
fornia, 1930. (Lithoprinted.)

5 To reduce printing costs A Cluster Analysis of Office
Operations has been deposited with the American Docu-
mentation Institute. Order Document 3627 from
American Documentation Institute, 1719 N Street
N.W., Washington, D. C., remitting $1.00 for photo-
copies (6 X 8 inches) readable without optical aid or
$1.00 for microfilm (images 1 inch high on standard
35 mm. motion picture film).


A Factor Analysis of a Salary Job Evaluation Plan * 


Allen H. Howard and Howard G. Schutz 
Department of Psychology and Education, Illinois Institute of Technology


The purpose of this study is to analyze 
factorially the operation of a point system of 
job evaluation as used in rating salary or 
office-type jobs in a manufacturing plant, and 
to determine the basic factors involved in the 
functioning of the system and their significance 
in determining labor grade level. Previous 
similar or related studies by Lawshe and 
associates (2, 3, 4, 5, 6), Rogers (7), and 
others have indicated consistently that regard- 
less of the type of job evaluation plan used, 
the number of elements rated, or other specific 
features of the installation, job grade level 
seems to be determined by the operation of a 
very limited number of factors, one of these 
often accounting for 95 per cent or more of the 
labor grade variance. These studies have
demonstrated that in many job rating
plans the component rating scales do not
provide independent measurements, and fur-
ther, that certain of the variables are not
significantly related to the combined job rating.


* [Beginning of footnote illegible in source] ... the opportunity to
[illegible] this analysis, and to Dr. P. S. [Sh]urrager for his critical
reading of the manuscript.


Data and Procedure 


The present study is based on data obtained 
in a large electrical manufacturing plant where 
grading of non-supervisory, office-type jobs by 
means of a grading plan similar to the NEMA 
salary plan has been going on for a dozen or 
more years. Job rating is performed with the 
cooperation of line supervision by carefully 
trained analysts, whose work is rigorously 
checked by their own supervisors. 

The essentials of the grading plan are pre- 
sented in Table 1 with the 12 job attributes 
classified in terms of the a priori Skill, Respon- 
sibility, and Job Conditions attribute categories 
established by the authors of the plan. Of the 
11 scales having more than two divisions four 
(Nos. 1, 2, 3, and 12) are divided arithmetically; 
the remaining seven are divided irregularly 
with larger steps at the high ends. 

The general characteristics of most of the 
attributes are suggested by their respective 
titles, but several require additional comment. 
Attribute 8, Responsibility for Service Relations, 
is concerned essentially with responsibility for 
(a) relations with customers and (b) the secu- 
rity of confidential personnel information. 
Attribute 9, Responsibility for Work of Others, 
“relates to non-supervisory responsibility for 
instructing other employees, assigning work to 
them, coordinating their efforts, and maintain- 
ing the flow of work within a group.” Attribute 
10, Business Relations, is “a measure of the 
extent to which the job necessitates meeting, 


Table 1

Summary of Job Evaluation Plan Showing Total Point Range and Number of
Scale Divisions for Each Attribute

                                                              Total         Number of
Attribute                                                     Point Range   Scale Divisions
Skill
   1. Education                                               [illegible]   [illegible]
   2. Experience                                              [illegible]   [illegible]
   3. Analytical Ability                                      45            [illegible]
   4. Judgment                                                [illegible]   [illegible]
   5. Ingenuity                                               [illegible]   [illegible]
   6. Expression                                              35            [illegible]
Responsibility
   7. Responsibility for Monetary Loss                        50            3
   8. Responsibility for Service Relations                    50            [illegible]
   9. Responsibility for Work of Others                       [illegible]   4
  10. Business Relations                                      [illegible]   [illegible]
  11. Supervision Received                                    35            [illegible]
Job Conditions
  12. Working Conditions and Physical Demand                  30            7




dealing with, and influencing other people.” 
Attribute 5, Ingenuity, is defined as inventive 
ability “involving the origination of new de- 
signs, processes, manufacturing or business 
methods, or adapting of existing designs, proc- 
esses, manufacturing or business methods, to 
new or changed situations.” Ingenuity has 
been a little used scale in the evaluation of the 
vast majority of office-type jobs in this com- 
pany. In fact only 11 of the sample of 200 
evaluations used in the present study received 
the higher of the two possible ratings of this 
attribute. Since under these conditions any 
correlation of Ingenuity ratings with other 
attribute ratings would be of negligible signifi- 
cance, this attribute was not included in the 
computations to be described. 

Data consisted of 200 evaluations represent- 
ing a wide range of jobs at all grade levels, 
including routine clerical occupations as well 
as more highly specialized jobs in such areas 
as purchasing, accounting, personnel, drafting, 
inspection, and production control. 


Results 


Eleven attribute ratings and job grade were 
intercorrelated by the product-moment method 
and the resulting matrix, which is presented in 
Table 2, factor analyzed by Thurstone’s 
centroid technique. After the extraction of 
one factor several criteria indicated that 




additional factors would probably be the 
result of chance variance only. An attempted 
extraction of a second factor gave a meaningless 
result after as well as before rotation. 
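For readers unfamiliar with the centroid technique, the extraction of a first factor can be sketched as follows. This is a textbook outline under stated assumptions, not the authors' worksheet: communality estimates are assumed to be already in the diagonal, all column sums are assumed positive (so no sign reflection is needed), and the 3 x 3 matrix is hypothetical.

```python
import math


def first_centroid_loadings(corr):
    """First centroid loadings: column sums divided by the root of the grand sum."""
    n = len(corr)
    col_sums = [sum(corr[i][j] for i in range(n)) for j in range(n)]
    grand_total = sum(col_sums)
    return [s / math.sqrt(grand_total) for s in col_sums]


def residual_matrix(corr, loadings):
    """Correlations remaining after the factor's contribution is removed."""
    n = len(corr)
    return [
        [corr[i][j] - loadings[i] * loadings[j] for j in range(n)]
        for i in range(n)
    ]


# Hypothetical matrix with communality estimates in the diagonal.
R = [
    [0.8, 0.8, 0.6],
    [0.8, 0.8, 0.5],
    [0.6, 0.5, 0.6],
]

loadings = first_centroid_loadings(R)
residuals = residual_matrix(R, loadings)
residual_total = sum(sum(row) for row in residuals)
print([round(a, 3) for a in loadings], round(residual_total, 6))
```

When the residual entries are all small, as the authors report after one factor, further extraction mainly fits chance variation.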

As indicated in Table 3, the solitary group 
factor determined by the factor analysis is 
common in some degree to all eleven of the 
attributes included in the study. It accounts 
for a major part of the variance in eight of 
them (Nos. 1 through 7, 10, and 11) and for 
approximately 99 per cent of the variance in 
job grade level. The magnitude of the latter 
relationship is a function of the relatively 
heavy weights assigned in the grading plan to 
these attributes, and of the greater variability 
of these attributes as compared with the others. 
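The step from a factor loading to a share of variance can be made explicit. Assuming standardized variables and a single orthogonal factor (an assumption of this note), the proportion of a variable's variance attributable to the factor is the square of its loading. The Education loading of .82 is taken from Table 3; the job grade loading is inferred here from the 99 per cent figure in the text.

```latex
\[
h_j^2 = a_j^2, \qquad
a_{\text{Education}} = .82 \;\Rightarrow\; h^2 = (.82)^2 \approx .67,
\]
\[
h_{\text{grade}}^2 \approx .99 \;\Rightarrow\; a_{\text{grade}} \approx \sqrt{.99} \approx .995 .
\]
```

By the same rule, a loading must exceed about .71 for the factor to account for more than half of an attribute's variance.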

Those attributes heavily saturated with this 
factor represent elements which should be 
expected to vary directly with the general 
skill and ability required by the job. The 
factor, then, is similar to the primary “Skill 
Demands” factor of Lawshe and others and 
has been so designated. Three of the attri- 
butes, Nos. 8, 9, and 12, are only slightly 
weighted with the “Skill Demands” factor. 
In the correlation matrix these attributes show 
relatively little relation to each other, to job 


Table 2

Intercorrelations of Attribute Ratings and Job Grade

(Decimal points omitted. Columns are Attributes 2, 3, 4, 6, 7, 8, 9, 10, 11, 12, and Job Grade.)

Attribute                     2    3    4    6    7    8    9   10   11   12   Job Grade
 1. Education                76   78   74   71   55   16   32   68   69  −12   86
 2. Experience                    85  [8?]  72   69   17   40   77 [76?] −12   94
 3. Analytical Ability                 81   72   67   15   29   78   74  [?]   92

[Rows for Attributes 4 and 6 through 12 are illegible in the scan.]




Table 3

Factor Loading for Each Attribute and Job Grade

                                                                  Factor
Attribute                                                         Loading
Skill
   1. Education                                                    .82
   2. Experience                                                   .90
   3. Analytical Ability                                           .89
   4. Judgment                                                     .91
   6. Expression                                                   .88
Responsibility
   7. Responsibility for Monetary Loss                            [.75?]
   8. Responsibility for Service Relations                         .21
   9. Responsibility for Work of Others                            .38
  10. Business Relations                                           .88
  11. Supervision Received                                         .88
Job Conditions
  12. Working Conditions and Physical Demand                      −.27

Job Grade                                                          .99


grade, or to the other eight attributes, and it
reasonably can be assumed that they function
as relatively independent and quite specific
variables, although of little significance in
determining job grade. If each one were
represented in the matrix more than once,
they should come out in the analysis (if based
on a sufficiently large sample of evaluations)
as a series of additional group factors.

As stated previously, Attribute 8, Respons-
ibility for Service Relations, is concerned
essentially with responsibility for (a) relations
with customers and (b) the security of con-
fidential personnel information. That these
two aspects of Attribute 8 may vary independ-
ently is perhaps a partial explanation of the
paucity of correlation between this attribute,
the others, and job grade. However, it is
reasonable to believe that while the two
aspects of Responsibility for Service Relations
may vary to some extent independently of
each other, they also vary independently of
the other attributes; that is to say, they
represent real and objective job demands,
which can be judged to a considerable extent
as matters of fact without relation to the
“Skill” and other variables.

A study of the functioning of Attribute 9,
Responsibility for Work of Others, suggests 
that this attribute need not be expected to show 
a very close relation to the general level of 


skill or to any specific attribute, because 
practically all jobs may carry this respons- 
ibility in any degree. Thus a job demanding 
only little more than the minimum of skill and 
ability may involve partial responsibility for 
directing the efforts of a helper, or even of 
coordinating the efforts of a group of co- 
workers; as, for example, a typist required to 
coordinate the flow of work within a group of 
which she is a member. 

Like Responsibility for Service Relations,
Attribute 12, Working Conditions and Physical 
Demand, is made up of what are probably two 
partially independent variables, (a) working 
conditions and (b) physical demand. Here 
again the independent variation of these two 
elements may serve to reduce the correlation 
between this attribute, the others, and job 
grade. The negative correlation between the 
Working Conditions and Physical Demand 
attribute and the other elements in the matrix 
accounts for its negative loading with the 
“Skill Demands” factor. In other words, 
the negative loading is the result of a tendency 
for those jobs representing the higher skill 
levels to require lesser amounts of visual 
demand and physical effort and to be performed 
in less objectionable surroundings than jobs 
at the lower skill levels. A similar relationship 
has been reported by other investigators 


(1, 3, 4). 




Summary 


The foregoing analysis suggests that as a 
measuring instrument the job evaluation plan 
functions in a manner quite different from 
that probably intended by its authors. A 
single factor corresponding to Lawshe’s “Skill 
Demands” accounts for a major portion of 
the variance in eight of the attributes and for 
approximately 99 per cent of the variance in 
job grade. The remaining attributes, although 
of little significance in accounting for variance 
in job grade, seem to represent job elements 
which function independently of the “Skill” 
factor. Thus many of the variables in the 
plan are not being evaluated independently, 
and the arbitrary weights which have been 
assigned to others do not give any realistic 
indication as to the degree of their association 
with job grade. 


Received August 19, 1951. 




References 


1. Chesler, D. J. Reliability and comparability of
different job evaluation systems. J. appl. Psy-
chol., 1948, 32, 465-475.

2. Lawshe, C. H., Jr., and Satter, G. A. Studies in
job evaluation. 1. Factor analysis of point rat-
ings for hourly-paid jobs in three industrial
plants. J. appl. Psychol., 1944, 28, 189-198.

3. Lawshe, C. H., Jr., and Maleski, A. A. Studies in
job evaluation. 3. An analysis of point ratings
for salary-paid jobs in an industrial plant. J.
appl. Psychol., 1946, 30, 117-128.

4. Lawshe, C. H., Jr., and Alessi, S. L. Studies in
job evaluation. 4. Analysis of another point
rating scale for hourly-paid jobs and the ade-
quacy of an abbreviated scale. J. appl. Psy-
chol., 1946, 30, 310-319.

5. Lawshe, C. H., Jr., and Wilson, R. F. Studies in
job evaluation. 5. An analysis of the factor
comparison system as it functions in a paper mill.
J. appl. Psychol., 1946, 30, 426-434.

6. Lawshe, C. H., Jr., Dudek, E. E., and Wilson, R. F.
Studies in job evaluation. 7. A factor analysis
of two point rating methods of job evaluation.
J. appl. Psychol., 1948, 32, 118-129.

7. Rogers, R. C. Analysis of two point-rating job
evaluation plans. J. appl. Psychol., 1946, 30,
579-585.


A Factor Analysis of Employee Attitudes * 


Robert J. Schreiber, Robert G. Smith, Jr., and Thomas W. Harrell 
University of Illinois 


There is a lack of factorial studies on 
employee attitudes such as have been per- 
formed in the field of social attitudes (1, 3, 4, 
5, 8, 11, 14). Goodwin Watson (15, p. 360) 
has presented an extensive list of attitude 
factors gained from employee verbalizations, 
and Strong (12, pp. 529-530) has presented a 
list of motivations that have been compiled 
from the existing literature.

What are the basic factors in employee 
attitudes? Ghiselli and Brown (7, p. 435) 
suggest that they are attitudes toward manage- 
ment and the working situation, and beliefs 
as to what constitutes company policy. 
While it is possible to hypothesize various 
factors, the hypotheses will, naturally, vary 
somewhat with the individual who hypoth- 
esizes. One method of solution is factor 
analysis.

Hull and Kolstad (10, p. 349) attempt to
name factors by applying social attitude
factors to employee attitudes. The validity
of this remains in doubt until we have definitely
ascertained that social attitudes and employee 
attitudes are interrelated. 


The Attitude Variables and the Sample 


During the academic year 1947-1948, the
attitudes of the non-academic employees of
the University of Illinois were surveyed by
the interview questionnaire method. The
survey covered a random sample of over [illegible]
of the non-academic employees at the four
campuses: the Galesburg undergraduate cam-
pus, the Navy Pier undergraduate campus in
Chicago, the Urbana campus, and the Univer-
sity of Illinois Medical Center in Chicago.
The results are reported in (6) and (9). 

Five broad areas of job satisfaction were 
included in the survey, one of the purposes 


* The present paper is a summary and an extension
of the Masters Thesis by Schreiber, now [with ... illegible]
and Associates, Inc., New York, N. Y. The thesis was
supervised by Harrell. Smith had charge of the rota-
tion of axes and assisted in writing an early draft of this
paper. The University of Illinois Graduate Research
Board gave financial support for statistical assistance
in carrying out the analysis.


of which was to evaluate the effectiveness of 
various personnel policies in promoting job 
satisfaction. These a priori classifications 
were: (1) adjustment to job; (2) supervision; 
(3) participation-expression; (4) working condi- 
tions; and (5) incentives. 

Since supervisory personnel are often identi- 
fied with management by the rank-and-file 
worker, and since differences in attitudes of 
supervisory and non-supervisory employees 
have been found (10, p. 358), only the latter 
group are considered in this study. The size 
of the sample is 379. 


Intercorrelations of the Variables 


The responses were given weights of one or 
zero. In general, ones represent positive 
responses, while zeros are negative. There 
was a small percentage of No Answer responses, 
in no case representing more than 6% of 
responses to any one question. The No 
Answer responses were considered with regard 
to their meaning for each question, and weights 
were decided by two judges. These weights 
are indicated in the questionnaire. 

The weights were so arranged as to dichot- 
omize the responses as close to the median as 
possible and were inspected by the judges to 
insure that clearly negative responses were not 
classified as positive, or vice versa. 

Computing diagrams for tetrachoric correla- 
tions (2) were used to determine the coeffi- 
cients. At least two determinations, and if 
possible, more, were made, and the mean 
value used in an effort to increase the reliability 
of the correlation. 
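The authors read their tetrachoric coefficients from computing diagrams. As a rough stand-in, the well-known cosine-pi approximation can be computed from a fourfold table; it is closest to the true tetrachoric value when both variables are split near the median, which matches the dichotomization described above. The approximation is a substitute technique, not the authors' procedure, and the cell counts are hypothetical.

```python
import math


def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric r for a 2 x 2 table.

    a and d are the agreeing cells (+,+ and -,-); b and c are the
    disagreeing cells (+,- and -,+).
    """
    if b * c == 0:
        if a * d == 0:
            raise ValueError("table is degenerate: both cell products are zero")
        return 1.0  # no disagreeing cases in at least one cell
    ratio = math.sqrt((a * d) / (b * c))
    return math.cos(math.pi / (1.0 + ratio))


# Hypothetical fourfold tables of 100 respondents each.
r_strong = tetrachoric_cos_pi(40, 10, 10, 40)
r_none = tetrachoric_cos_pi(25, 25, 25, 25)
print(round(r_strong, 3), round(r_none, 3))
```

Averaging several independent determinations, as the authors did with the diagrams, addresses reading error; it has no counterpart in a direct computation like this one.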

Over half of the inter-item correlations lie 
between .10 and .30. While these are not
high, they are useful for indicating trends. 

The mean intercorrelations between the a 
priori classes of items are shown in Table 1. 
It should be noted that the standard error of a 
tetrachoric r of 0.00 where N = 379 is .07.
Hence, any r larger than .18 is significantly
different from 0.00 at the 5% limit. 
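A mean intercorrelation between two a priori classes is simply the average of every correlation that pairs an item from one class with an item from the other. A minimal sketch, with a hypothetical function name and a hypothetical matrix rather than the survey data:

```python
def mean_between(corr, group_a, group_b):
    """Mean of all correlations pairing an item in group_a with one in group_b."""
    values = [corr[i][j] for i in group_a for j in group_b]
    return sum(values) / len(values)


# Hypothetical 5-item correlation matrix (illustrative values only).
R = [
    [1.00, 0.80, 0.75, 0.10, 0.05],
    [0.80, 1.00, 0.70, 0.15, 0.00],
    [0.75, 0.70, 1.00, 0.05, 0.10],
    [0.10, 0.15, 0.05, 1.00, 0.60],
    [0.05, 0.00, 0.10, 0.60, 1.00],
]

mean_r = mean_between(R, [0, 1, 2], [3, 4])
print(round(mean_r, 3))
```

Because positive and negative coefficients cancel in such a mean, the count of each sign is worth inspecting alongside it.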

It is noteworthy that the correlations 




Table 1

Mean Intercorrelations Between Groups of Items

          I*           II*     III*    IV*
II*      .22
III*     [illegible]   .30
IV*      .23           .24     .25
V*       .10 [sign unclear]   .13     .02     .16

* I. Adjustment to Job: Items 1-4; II. Supervision:
Items 5-10; III. Participation-Expression: Items 11-
14; IV. Working Conditions and Facilities: Items 15-
17; and V. Incentives: Items 18-28.


between Area V, Incentives, and the other 
areas are lower than the correlations between 
any of the remaining areas. Inspection of 
the item intercorrelations indicates that this 
is due to items 24-27, items which are more 
informational than attitudinal. This pre- 
cludes many conclusions that might be drawn 
about the relationship of Area V with other 


areas. From the evaluation of the inter- 
correlations of the other broad areas, and 
consideration of the number of positive, and 
negative r’s affecting the mean, it is possible to 
draw some conclusions: 


1. There is a small negative relationship 
between satisfaction with participation-expres- 
sion and knowledge of University policies. 

2. Satisfaction with supervision and satisfac- 
tion with participation-expression are posi- 
tively related. 

3. The quality of supervision is positively 
related to satisfaction on the job. 

From an examination of some of the higher 
correlations between individual items (r>.35), 
it is possible to conclude that: 

1. There is an indication that women 
employed by the University more often than 
men say that they know what is expected of 
them on the job when they begin work here. 

2. Persons with greater amounts of educa- 


Table 2

Loadings on Factor I, Job Satisfaction

Loading  Item
 .71   5. Do you feel free to speak to your foreman or supervisor and ask him for help or an explanation?
 .66   9. When you have done a good job are you ever given credit or complimented?
 .64  11. How much dissatisfaction do you think there is among workers in your department?
 .63  16. How friendly and cooperative is the feeling between the people with whom you work?
 .60   3. To what extent did you know what was expected of you on the job when you first began to
          work here?
 [?]  18. Do you think you will be able to hold your present job as long as you want to?
 [?]  13. Do you feel free to express your “gripes” and dissatisfactions to higher authority?
 [?]  [?]. How much freedom do you feel that your supervisor or foreman allows you in doing your job?
 [?]  [?]. Are you always informed when there are changes in rules which affect you?
 [?]  12. What is the most important grievance that employees have? (lack of grievances)
 [?]  [?]. Satisfaction with tools and equipment.
 [?]  [?]. Do you feel that good, hard work will pay off in larger opportunity or pay?
 [?]  [?]. Did the instructions first given to you as to your job and duties help much in making the
          “breaking in” period easier?
 [?]  [?]. Is [illegible] always able to assist or explain details of your job?
 .42   7. [How w]ell does your supervisor know University policies, rules, or procedures?
 .38   1. How much were you able to use your past experience or training when first hired by the Uni-
          versity?
 .38   2. To what extent did you know what was expected of you on the job when you first began to
          work here?
 [?]  [?]. How much have you to say about the way your work should be done? (May you offer sugges-
          tions or change a way of performing a task?)
 .38  15. How do your general working conditions compare with other places in the city?
 .37  20. How much does your particular job allow growth or development of your knowledge or skill?
 .38  22. To what extent, if any, are opportunities provided for promotion and advancement?


Table 3

Loadings on Factor II, Knowledge of Employee Benefits

Loading  Item
 .71  26. Under the University policy, what provisions are made for your beneficiary in case of your death?
 .70  25. How much time off with pay is allowed each year for extended sick leave?
 .68  34. Member of retirement system.
 .67  27. What is the University plan for life income following retirement from actual service?
 .63  31. Years of service.
 .58  28. What do you think of the University’s attitude toward the unions?
 .50  24. How much time off with pay is allowed each year for sick leave?
−.48  33. Marital status (single).
−.66  30. Sex (male).
−.09  29. Age (below 30).
 .54  32. Educational background (not over 8 years).


tion tend to feel that their jobs will allow them
more growth and development of their skills
than do persons with less education.

3. Persons who are members of the retirement
system more often than not see little
opportunity for advancement in their jobs.

4. There is a moderate correlation between
thinking that there is opportunity for advancement
and thinking that hard work will pay
dividends.

5. Another moderate correlation exists between
having freedom in doing the job and
freedom in expressing dissatisfactions.

6. Having satisfaction with tools and equipment
is related to lack of grievances.


Factor Analysis 


The intercorrelations between variables were
factor-analyzed by the Complete Centroid
Method of Thurstone (13, p. 161 ff.). The
unknown communalities were estimated by
inserting the highest coefficient in each
column into the diagonal cell of that column.
The communalities were re-estimated after
the extraction of each factor.
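
The communality estimate described above can be illustrated with a small sketch. It is not the authors' computation: the matrix below is hypothetical, the function name `estimate_communalities` is invented for illustration, and taking the largest coefficient in absolute value (rather than the largest signed value) is an assumption.

```python
def estimate_communalities(corr):
    """Return a copy of `corr` whose diagonal holds, for each column,
    the largest absolute off-diagonal coefficient in that column.

    Assumption: "highest coefficient" is read as highest in absolute value.
    """
    n = len(corr)
    reduced = [row[:] for row in corr]  # copy so the input stays untouched
    for j in range(n):
        off_diagonal = [abs(corr[i][j]) for i in range(n) if i != j]
        reduced[j][j] = max(off_diagonal)
    return reduced


# Hypothetical 3-variable correlation matrix (not data from the study).
example = [
    [1.00, 0.50, 0.30],
    [0.50, 1.00, -0.60],
    [0.30, -0.60, 1.00],
]
reduced = estimate_communalities(example)
diagonal = [reduced[k][k] for k in range(len(reduced))]
print(diagonal)
```

Re-estimating after each factor, as the text describes, would mean applying the same step to the residual matrix left once a factor has been extracted.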


1 The final questionnaire and the complete tables of
intercorrelations of the variables, together with complete
factor analysis tables, have been deposited with
the American Documentation Institute. These are:
Table 4, Table of Intercorrelations; Table 5, [illegible]
Factor Matrix; Table 6, Transformation Matrix; Table
7, Directional Cosines of the Factors; Table 8, [illegible]
Factor Matrix V. Order Document No. [illegible] from
American Documentation Institute, 1719 N Street,
N.W., Washington 6, D. C., remitting $1.00 for microfilm
(images 1 inch high on standard 35 mm. motion
picture film) or $1.35 for photocopies (6 X 8 inches)
readable without optical aid.


Eight sets of factor loadings were rotated 
blindly, once orthogonally, and once obliquely, 
to simple structure. After rotation it appeared 
that eight factors were undoubtedly too many, 
since clear interpretation could be given to 
only two. 


Interpretation of the Factors 


It was decided to use only variables with 
loadings of .40 or higher. However, since 
the factor loadings are actually significant 
only to the first digit, this criterion has been 
used with that in mind. 

The first factor has been called “Job 
Satisfaction.” The variables it loads .35 or 
higher are listed in Table 2. 

The variables which this factor loads 
significantly are those which both previous 
studies and a priori considerations would 
indicate made for satisfaction with the job if 
a sufficient number of these are answered 
favorably. The importance of good super- 
vision is underscored, since about half of the 
variables directly concern supervision. There 
are, however, enough variables dealing with 
other aspects of job satisfaction to warrant 
not calling this factor “Supervision.” This is 
a factor of wide scope, since the only variables 
not significantly loaded with this factor are 
those dealing with knowledge of personnel 
policies and personal information items. 

There is one variable, which, while dealing 
with an aspect of satisfaction with work, has 
a factor loading greater than zero, but not 
high enough to be used in interpretation of 




Factor I. This is Item 19, which deals with 
satisfaction with pay. This variable has a 
low communality. Some other items which 
were used in interpretation of the factor also 
have low communalities. It is possible, then, 
that in addition to this factor of wide scope, 
there may be other factors dealing with 
narrower aspects of job satisfaction, such as 
pay, working conditions, and so on which do 
not appear in this analysis. Further research 
is therefore indicated in order to include 
sufficient variables to define these hypothesized 
factors. 

Factor II has been called “Knowledge of 
Employee Benefits,” since the greater number 
of items are concerned with information 
concerning personnel policies which are related 
to benefits. The variables which identify this 
factor are indicated in Table 3. 

It seems quite logical that members of the 

retirement system should be familiar with 
this information. The negative loading of 
age on this factor is an artifact, for two 
reasons. First, a positive weight was given 
those below 30, and a negative weight those 
above 30. In addition, all employees over 30 
are automatically members of the retirement 
system, unless physically or otherwise dis- 
qualified. It is reasonable to expect women 
to have more information, since most of the 
women are clerical workers and the University’s 
Clerical Council furnishes some information 
in this line that other workers lack. It is 
reasonable also to expect employees who are 
married and those of long service to have an 
interest in, and therefore know more concerning,
policies affecting their security. One
loading is somewhat anomalous. That is the 
tendency for workers with smaller amounts 
of education to know more concerning em- 
ployee benefits than workers of a higher 
educational level. The group with the highest 
rate of turnover is the food service employees, 
and these are all of low educational level. A 
possible explanation is that the low education 
employees are also relatively low in pay, and 
therefore have fewer personal resources. These
workers must then fall back on benefits
guaranteed by the University. This has
been corroborated by personal interviews.


Robert J. Schreiber, Robert G. Smith, Jr., and Thomas W. Harrell 


Summary 


The responses of 379 non-supervisory non- 
academic employees of the University of 
Illinois to 34 questionnaire items were inter- 
correlated and factor-analyzed. Of eight 
factors rotated, only the first two were clearly 
defined. These were called “Job Satisfaction”
and “Knowledge of Employee Benefits.” 


Received September 27, 1951. 


References 


1. Carlson, H. B. Attitudes of undergraduate students.
J. soc. Psychol., 1934, 5, 202-213.

2. Chesire, L., Saffir, N., and Thurstone, L. L. Computing
diagrams for the tetrachoric correlation
coefficient. Chicago: Univ. of Chicago Bookstore,
1933.

3. Ferguson, L. W. Primary social attitudes. J.
Psychol., 1939, 8, 217-223.

4. Ferguson, L. W. The stability of the primary
social attitudes: I. Religionism and humanitarianism.
J. Psychol., 1941, 12, 283-288.

5. Ford, R. N., and Henderson, D. E. V. A multiple
factor analysis of Ford’s White-Negro Experience
Scales. Soc. Forces, 1942, 21, 28-34.

6. Ford, S. M. Attitudes of non-academic employees,
University of Illinois. Unpublished Master’s
Thesis, University of Illinois, 1948.

7. Ghiselli, E. E., and Brown, C. W. Personnel and
industrial psychology. New York: McGraw-Hill,
1948.

8. Hayes, S. P., Jr. The interrelations of political
attitudes: III. General factors in political issues.
J. soc. Psychol., 1939, 10, 379-398.

9. Heist, Arlene, and Heist, P. An attitude survey of
the nonacademic personnel of the University of
Illinois, 1948. Unpublished report on file at
the Office of Nonacademic Personnel, University
of Illinois, Urbana, Illinois.

10. Hull, R. L., and Kolstad, A. Morale on the job.
In Watson, G., Civilian morale. New York:
Reynal and Hitchcock, 1942.

11. Stagner, R., and Katzoff, E. T. Fascist attitudes:
Factor analysis of item correlations. J. soc.
Psychol., 1942, 16, 3-9.

12. Strong, E. K. Psychological aspects of business.
New York: McGraw-Hill, 1938.

13. Thurstone, L. L. Multiple factor analysis.
Chicago: Univ. of Chicago Press, 1947.

14. Vetter, G. B. What makes attitudes and opinions
“liberal” or “conservative”? J. abnorm. soc.
Psychol., 1947, 42, 125-130.

15. Watson, Goodwin (Ed.). Civilian morale. New
York: Reynal and Hitchcock, 1942.


A Factor Analysis of Terman and Miles’ M-F Test 


C. Fenton Ford, Jr.* and Leona E. Tyler 


University of Oregon 


No area in differential psychology has been 
explored more extensively than that of sex 
differences. Large and significant differences 
between many differently selected samples of 
males and females have been obtained using 
tests of mechanical and clerical aptitude, 
personality questionnaires, and interest inventories.
The most thorough single study
was that of Terman and Miles.† Their
Attitude-Interest Analysis Blank (M-F Test) is
a questionnaire made up entirely of items
upon which the sexes differ significantly. The
total score for an individual gives a masculinity
or femininity rating which can be compared
with norms for a wide variety of groups of men
and women.

One question needing to be answered with 
regard to this inventory and others of similar 
design has to do with the possible multidimensionality
of the “masculinity” or “femininity”
it measures. Is “masculinity” a
unitary trait or a composite of several?

It is this question to which the present study 
was directed. The subjects were ninth grade
students attending junior high schools in
Eugene, Oregon, 153 girls and 157 boys.
Classes to be tested were selected at the
convenience of the teaching staffs of the schools
involved, and there is no reason to suppose that
they are not representative of the general
population of that grade. The Terman and
Miles Attitude-Interest Analysis Blank was
given and scored as directed in the authors'
manual. In order to furnish a larger number
of part scores and avoid heterogeneity of
content within any subtest, scores were
grouped into fourteen subscores rather than
the seven indicated on the blank. They were
as follows: (1) Word Association; (2) Ink Blot

* Mr. Ford died [illegible]er 5, 1950, [illegible] completing
this study. It has been summarized for publication by
Leona E. Tyler. The unpublished M.S. thesis in which
the results were reported is available through the
library of the University of Oregon.

† Terman, L. M., and Miles, Catherine C. Sex and
personality: Studies in masculinity and femininity. New
York: McGraw-Hill, 1936.


Identification; (3) Information; (4) Things 
That Arouse Anger; (5) Things That Arouse 
Fear; (6) Things That Arouse Disgust; (7) 
Things That Arouse Pity; (8) Ethical Atti- 
tudes; (9) Interests; (10) Book Likes; (11) 
Activity Preferences; (12) Famous Persons; 
(13) Opinions; and (14) Introvertive Response. 

Split-half reliability coefficients and per- 
centages of overlapping for the two sex groups 
were calculated for all the subtests separately. 
Exercises 2, Ink Blot Identification, and 13, 
Opinions, were omitted from further computa- 
tions because of low reliability and poor 
discriminating power. 

Separate correlation matrices were developed 
for the two sex groups and factor analyzed, 
using Thurstone’s centroid method with or- 
thogonal rotation to simple structure. For the 
masculinity matrix, four successive repetitions 
of the analysis were made to stabilize the 
communalities. For the femininity matrix 
three repetitions were sufficient. 

For the boys, the result of the analysis is 
given by the rotated factor matrix included 
in Table 1. 

The first factor seems to be an emotional 
characteristic which might be named “Tough- 
ness” or “Insensitivity” since the kind of 
response scoring masculine on the subtests in 
question characterizes a person not easily 
moved to anger, disgust, or pity. The second 
factor is obviously an interest factor. Just 
why Exercises 3 and 12 show zero loadings on 
both factors after rotation cannot be ascer- 
tained from the data. They perhaps represent 
further factors not found to any extent in 
any other subtests included in the battery. 
The adjusted communalities for both tests 
were practically zero. 

The rotated factor matrix for the girls is 
shown in Table 2. 

For girls also the first factor would seem to 
be of an emotional nature and can be labeled 
“Sensitivity” since the feminine responses to 
items in the subtests involved are in the direction
of arousal of the emotions. Factor II




Table 1
Centroid and Rotated Factor Loadings, Masculinity, N = 157 Boys

                                   Centroid             Rotated*
Exercise                       Factor I  Factor II  Factor I  Factor II
 1. Word Association              .23      -.13        .1        .2
 3. Information                  -.06       .02        .0        .0
 4. Anger                         .51       .34        .6        .0
 5. Fear                          .29       .19        .3        .0
 6. Disgust                       .74       .36        .8        .1
 7. Pity                          .58       .31        .6        .0
 8. Ethical Attitudes             .70       .22     [illeg.]     .2
 9. Interests                     .50      -.37        .2        .6
10. Books                         .39      -.49        .0        .6
11. Activity Preference        [illeg.]    -.53        .0     [illeg.]
12. Famous People                -.02      -.04        .0        .0
14. Introvertive Response         .37       .08        .3     [illeg.]

* Angle of rotation = 32° 07'.
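
The rotation angle in the table footnote can be made concrete with a short sketch of a plane rotation applied to one pair of centroid loadings. This is an illustration only: the function name `rotate_pair`, the direction of rotation, and the sign convention for the second axis are assumptions, since the article does not state them.

```python
import math


def rotate_pair(a, b, degrees, minutes=0):
    """Rotate the loading pair (a, b) through an angle given in degrees and minutes.

    Assumed convention: new_I = a*cos + b*sin, new_II = -a*sin + b*cos.
    """
    theta = math.radians(degrees + minutes / 60.0)
    c, s = math.cos(theta), math.sin(theta)
    return (a * c + b * s, -a * s + b * c)


# Centroid loadings printed for Exercise 6 (Disgust) in the masculinity matrix.
disgust_centroid = (0.74, 0.36)
disgust_rotated = rotate_pair(disgust_centroid[0], disgust_centroid[1], 32, 7)

# An orthogonal rotation leaves the communality (sum of squared loadings) unchanged.
h2_before = sum(x * x for x in disgust_centroid)
h2_after = sum(x * x for x in disgust_rotated)

print(tuple(round(x, 2) for x in disgust_rotated))
print(round(h2_before, 4), round(h2_after, 4))
```

Under these assumptions the sketch gives loadings of roughly .82 and -.09 for Disgust, which agree in magnitude with the one-digit values .8 and .1 printed in the rotated columns; the sign on the second factor depends on which way that axis is reflected.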


is again primarily an interest factor. Factor 
III is more difficult to interpret, but in the 
light of Terman and Miles’ description of 
the nature of characteristically feminine re- 
sponses to subtests 1, 4, and 9, it appears to 
be primarily a social role factor reflecting a 
girl’s awareness of the part she is expected to 
play in our culture. Why Exercise 5, Things 


That Arouse Fear, should have a negative 
loading is not clear, unless some concept of 
bravery, normally thought to be a masculine 
trait, enters into the stereotype of the ideal 
woman. In general, Exercise 5 gives some- 
what inconsistent results in this study, with a 
factor loading in the masculinity matrix 
considerably lower than the other subtests of


Table 2
Centroid and Rotated Factor Loadings, Femininity, N = 153 Girls

                                       Centroid                         Rotated*
Exercise                     Factor I  Factor II  Factor III  Factor I  Factor II  Factor III
 1. Word Association            .40      -.35        .53      [illeg.]     .0         .8
 3. Information                 .22       .04     [illeg.]       .2        .0         .2
 4. Anger                       .70       .14        .42         .6       -.1         .6
 5. Fear                        .33    [illeg.]   [illeg.]    [illeg.]     .3        -.4
 6. Disgust                     .52       .35     [illeg.]       .6     [illeg.]      .0
 7. Pity                        .39       .40        .07         .6     [illeg.]      .0
 8. Ethical Attitudes           .51       .19        .00         .5     [illeg.]   [illeg.]
 9. Interests                   .45      -.38       -.06      [illeg.]  [illeg.]      .4
10. Books                    [illeg.]  [illeg.]     -.26         .0        .5      [illeg.]
11. Activity Preferences     [illeg.]    -.53       -.21         .0        .6         .3
12. Famous People               .27       .03        .05         .2        .1         .2
14. Introvertive Response       .24       .05       -.13      [illeg.]  [illeg.]   [illeg.]

* Angle of first rotation (Planes I-II) = 41° 40'; and angle of second rotation (Planes II-III) = 43° 22'.




emotional characteristics, and an appreciable 
loading on Factor II as well as Factor I in 
the femininity matrix. 

On the basis of these results we are justified 
in concluding that psychological masculinity- 
femininity is not a unitary trait. It has at 
least two dimensions for both sexes, represent- 
ing emotional characteristics and interests 
respectively. There is another possible third 
dimension for females having to do with 
acceptance of a feminine social role. Possibly 
the inclusion of a wider variety of M-F tests 


in a factor analysis would identify still more 
factors. As a consequence, it would appear 
that one should scrutinize closely the pattern 
as well as the extent of M-F deviations 
represented by scores on any of our currently 
used tests in this area before making judgments 
about individual cases. Terman and Miles’ 
Exercise 4, Emotional and Ethical Attitudes,
and Exercise 5, Interests, represent fairly well 
the two principal factors which this study 
has identified from the blank as a whole. 


Received July 30, 1951. 


Vocational Interests of Retired YMCA Secretaries * 


Wallace A. Verburg


University of Kansas 


Strong’s studies (2) of vocational interests 
were limited primarily to students and em- 
ployed individuals under 60 years of age. 
This investigation is concerned with the 
vocational interests of retired YMCA general 
secretaries, some of whom were included in 
Strong’s criterion group in 1927. 

The general hypothesis of this research was 
that there would be no significant difference 
between the vocational interests of retired 
and employed members of an occupation. 


Procedures 


The Strong Vocational Interest Blank was 
mailed to 131 retired and 134 active YMCA 
general secretaries who lived in all parts of the 
United States. Retirement at 60 is customary 
in the YMCA, and is compulsory at 65. The 
return of usable Blanks was 66 per cent. No 
differences obtained between frequency of 
response from employed and retired secretaries, 
but in the retired group persons 66 years of 
age and below tended to respond more fre- 
quently than those above 66 years of age. 
The difference, tested by the chi square 
technique, was significant at the .02 level. 
Britton (1) noted no difference in frequency 
of response to an attitude inventory by retired 
YMCA secretaries above and below 70 years 
of age. 

Respondents in the retired group in this 
study had a mean age of 66, 14.7 years of 
education, an average of 37 years of service in 
the YMCA, and had been retired for an aver- 
age of 4.4 years. Respondents in the active 
group had an average age of 56 years, 15.6 
years of education, and 32.3 years of service. 
Generally, retired YMCA secretaries report 
themselves to be healthy, financially secure, 
and sociable. They were satisfied with their 
jobs before retirement and are satisfied in 

retirement.

* The major part of an Ed.D. thesis done by the
writer in 1951 (4) under the supervision of Dr. Donald
E. Super, Teachers College, Columbia University.


Blanks were machine scored for twelve 
occupational scales, six of which are “typical” 
of Strong’s groups, and three nonoccupational 
scales. Pretest (1927) scores for two sub- 
groups, 20 retired (Group RS) and 28 active 
(Group AS) YMCA secretaries were available 
from Strong’s criterion groups. These data 
permitted a check on change in interests over 
a 24-year period for these two groups. 


Results 


Results of the 1927 and 1951 tests for the 
two small subgroups (RS and AS) are presented 
in Table 1. 

No difference between mean scores for 
Groups AS and RS, on each of the fifteen 
scales in 1927, is significant at or beyond the 
.05 level. Neither is there any significant 
difference between the 1951 scores for these 
two groups. 

Two numerical differences between the 
1927 and 1951 scores for Group AS on the 
social science teacher and the public admin- 
istrator scales are significant beyond the .01 
level.1 Group RS has smaller numerical but
also significant changes on these two scales. 
These two differences are large enough to 
change the equivalent ratings for both Groups 
AS and RS from B’s to A’s. Other differences, 
although some are statistically significant, do 
not constitute meaningful changes. For in- 
stance, an increase from 13.1 to 16.1 in stand- 
ard score on the carpenter scale for Group AS, 
while significant at the .05 level, does not 
constitute a meaningful change in interest. 
Both scores are equivalent to C ratings which 
indicate a lack of interest in the occupation. 

It can be noted that generally the active 
secretaries have greater numerical changes in
mean scores than the retired secretaries.
However, while the active group gained .8 of a
standard score on the accountant scale, the
retired group gained 4 points in 24 years.

1 Since 1927 and 1951 scores are correlated, mean
changes in standard scores were computed and these
changes were tested for significance.
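
One common way to test a mean change when the two sets of scores are correlated is the t statistic for paired differences. The sketch below assumes that approach; the article does not name the test used, and the scores shown are hypothetical, not taken from the study.

```python
import math


def paired_t(before, after):
    """Return (mean change, t statistic) for paired scores.

    Each person's change is after - before; t is the mean change divided by
    its standard error, with n - 1 degrees of freedom.
    """
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equally long lists with at least two pairs")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    variance = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    standard_error = math.sqrt(variance / n)
    return mean_diff, mean_diff / standard_error


# Hypothetical standard scores for four people tested twice.
scores_first = [40, 42, 38, 45]
scores_second = [44, 47, 41, 50]
mean_change, t_value = paired_t(scores_first, scores_second)
print(mean_change, round(t_value, 2))
```

Working with the differences removes the between-person variation that the two testings share, which is why a change of a few points can be statistically significant in a small group.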




Table 1

Interest Scores of Retired (RS) and Active (AS) YMCA Secretaries Tested in 1927 and Retested in 1951

                            Mean Scores                      Standard Deviations
                       1927            1951               1927              1951
Scale      Group:    RS      AS      RS      AS         RS       AS       RS       AS
           N:        20      28      20      28         20       28       20       28
Y secy.             51.1    46.9    49.1    48.6       11.7     10.7     11.6   [illeg.]
Y phys.             42.2    39.6    42.2    40.2     [illeg.]   11.5     11.5      8.9
S. sci. tch.        40.7    38.8    46.0    46.1        9.9      9.1     10.5      9.2
Minister            42.2    37.2    42.4    41.2       11.6     11.9     11.5     12.5
Pub. adm.           40.1    38.4    45.2    45.7       10.4      9.4     10.2      9.3
Life ins.           37.9    36.0    37.1    34.9        9.8     10.2      8.3     11.1
Acct.               28.9    31.1    32.9    31.9     [illeg.] [illeg.]    9.8      9.5
Lawyer              29.6    30.7    30.2    31.6        9.6      9.2     10.0      8.3
Farmer              29.3    28.0    29.6    30.2       10.1      9.4      9.0      8.7
Physician         [illeg.]  21.4    18.7    19.6        6.1     10.8      6.1      7.9
Chemist             14.2    16.2    13.0    14.8       10.4     12.3     10.0     13.2
Carpenter         [illeg.]  13.1    18.5    16.1       12.2     11.0     12.8   [illeg.]
OL                [illeg.]  59.7    57.2    57.3        8.0      7.3      7.6      7.0
MF                [illeg.]  40.9    41.2    41.2        9.3      8.7      8.1      7.2
IM                  [only two means, 60.9 and 59.4, and two standard deviations, 6.7 and 4.5, are legible; their columns are unclear]


Gains for both groups are in the opposite
direction from that anticipated (2, pp. 215-
276). No change, however, for the retired
group can be explained as an effect of retirement.
This conclusion is supported further
when 1951 scores of larger groups of retired
(Group R) and active (Group A) secretaries
are compared in Table 2.

The similarity of mean scores for these
groups is striking. Also, mean scores of
these persons, who were not members of
Strong’s criterion group, are very similar to
those in Table 1. There are no significant
differences between 1951 scores on any scale
for these two groups and their scores are not
significantly different from comparable scores
of the smaller groups in Table 1. When
standard scores of the two retired groups
(R and RS) are combined and scores of the
two active groups (A and AS) are also pooled,
the greatest difference between combined
active and retired groups, on the minister
scale, is significant at the .05 level only.
Retired persons scored 3.6 higher on the
minister scale, which may be partially accounted
for by the fact that individuals with
interests similar to those of ministers tend to
score higher on the minister scale with increase
in age (2, pp. 274-276). Also, the retired
secretaries are older and served in a more
evangelical Y. This difference cannot be

Table 2

Interest Scores of Retired (Group R) and Active
(Group A) YMCA Secretaries Tested in 1951

                                  Mean Scores      Standard Deviations
Scale                  Group:      R       A          R        A
                       N:         61      65         61       65
Y secretary                      48.3    47.0       10.0     10.4
Y physical director              41.5    41.8       10.3     11.8
Social science teacher           44.3    44.8       10.4     10.4
Minister                         42.0    37.3       12.5     10.7
Public administrator             41.3    43.7        9.3     10.0
Life insurance salesman          38.0    38.8        9.5      9.6
Accountant                       32.5    29.7        9.7     10.7
Lawyer                           29.6    29.6     [illeg.] [illeg.]
Farmer                           29.0    29.3        9.7      9.1
Physician                        21.5    20.3        9.5      9.4
Chemist                          13.3    13.2       11.0     11.3
Carpenter                        17.5    16.2       13.5     12.6
OL                               56.6    57.4     [illeg.] [illeg.]
MF                               39.3    41.7        8.3      8.1
IM                               59.7    58.9        6.2      6.0




interpreted as being very meaningful since
Group AS has a rating which is equivalent
to a high B and Group RS has a low B+ rating.

Retired YMCA secretaries were classified
according to the similarity to YMCA work of
jobs they held in retirement. The group
(N = 28) that was employed in jobs dissimilar
to their previous occupation could not be
differentiated on the YMCA secretary scale
from the group (N = 38) that continued in jobs
after retirement which were similar to YMCA
work. It was possible to score only eleven
Blanks for dissimilar occupations since scales
for all of the 28 dissimilar occupations were
not available. These eleven individuals had
an average mean score equivalent to a B+
rating on their dissimilar occupations. Their
mean YMCA secretary scale score was equivalent
to an A rating. Apparently, vocational
interest scores for YMCA secretaries are
maintained even though individuals may
pursue an occupation in retirement which is
rated as being dissimilar to their former work.

The supposition that retired persons would 
score lower than active secretaries on the 
Occupational Level, Interest Maturity and 
Masculinity-Femininity scales was not borne 
out by this study. Strong (2) and Terman 
and Miles (3) have reported that men tend to 
become more feminine in interests with 
advanced age. There were no significant 
differences between M-F scores for the com- 
bined groups of retired and active YMCA 
secretaries in this study but when the retired 
and active group M-F scores were compared 
with those of groups of younger men (2, p. 231), 
they were significantly lower (more feminine). 
Mean scores for the active YMCA secretaries 
were also significantly lower (.05 level) and 
retired secretaries lower (.01 level) than for a 
group of 317 men between the ages of 50 to 
59 from Strong’s mixed criterion groups. 
However, 61 individuals from Strong’s men-in- 
general group between the ages of 50 to 59 
scored significantly lower (.0001 level) than 


both the active and retired groups of YMCA 
secretaries in this study. 


Summary 


Vocational interests of 131 retired and 134
active YMCA general secretaries were measured
by the Strong Vocational Interest Blank.
No significant differences obtained between 
1951 mean standard scores on 15 interest 
scales for retired and active groups. A 
tendency for both employed and retired 
YMCA secretaries to score slightly higher over a 
24-year period on the social science teacher and 
public administrator scale was noted. Retired 
secretaries had changes in scores on most scales 
which were numerically less than those for 
employed secretaries but their 1951 scores were 
very similar to those of employed secretaries. 
The observation that M-F scores for groups of 
older men, both active and retired, tend to be 
more feminine than those for younger men 
supports previous findings. 

1. Vocational interest scores, as indicated 
by 12 occupational scales, for a group of 
YMCA general secretaries are maintained in 
retirement at least beyond the age of 70 years.

2. Vocational interest scores for groups of 
YMCA general secretaries, both retired and 
active, remain relatively constant over a
24-year period. 

3. Occupational Level, Interest Maturity, 
and Masculinity-Femininity scores apparently 
are not affected by retirement. M-F scores 
for groups of retired and active secretaries 
are lower than M-F scores of younger men; 
however, YMCA secretaries in later maturity 
have more masculine interests than men-in- 
general of comparable age. 

4. Younger retired YMCA secretaries tend 
to return a questionnaire similar to the 
Strong Blank more frequently than older
retired secretaries. 


Received August 1, 1951.


References 


1. Britton, Jean O. A study of the adjustment of retired
YMCA secretaries. Unpublished Ph.D. dissertation,
Univ. of Chicago, 1949.

2. Strong, E. K., Jr. Vocational interests of men and
women. Stanford: Stanford Univ. Press, 1943.

3. Terman, L. M., and Miles, Catherine C. Sex and
personality. New York: McGraw-Hill, 1936.

4. Verburg, W. A. A study of the effects of retirement
on the interests of YMCA secretaries as indicated
by the Vocational Interest Blank. Unpublished
Ed.D. project, Teachers College, Columbia
Univ., 1951.


Predicting Success in Law School *


Ralph F. Berdie and Wilbur L. Layton


Student Counseling Bureau, University of Minnesota


Professional schools are among the most
frequent users of prediction indices for purposes
of selection. When facilities for training are
limited and laboratories and equipment expensive,
only those students who show the
greatest probability of succeeding are wanted.

The prevailing policies in many law schools,
however, allow any student to enter who has
completed the minimum number of years,
two, three, or four, of pre-legal work. Selection
then takes place, not prior to admission,
but during the first years of legal training,
and those who fail have the compensation of
knowing they had a chance. But those who
fail also run the risk of a serious traumatic
experience, to say nothing of the financial
wastes involved.

Predictive tests have proved to be of much
value to counselors and instructors working
with law students, both those who fail and
those who do not. A student whose scores
indicate he has ability to do well, but who
receives poor grades during his first year, can
be encouraged and assisted to develop proper
study methods or to discover other reasons
underlying his poor work. A failing student
with low test scores can be encouraged to
re-evaluate his capacities for this training and
to consider alternative training programs and
occupations. In this situation, scholastic prediction
does not cease when a student is
admitted. It must continue as long as the
student is to have counseling and personalized
instruction.


Purpose


The purpose of this study was to determine
the effectiveness of predictive data
for freshmen in law school at the end of their
first quarter in residence. The predictive
data were based on tests, pre-legal grades, and
a score on a “practice examination” given by
the law school at the end of the first quarter.
The criterion was the over-all grade of the
student at the end of his freshman year in
law school, based on grades in his four
freshman-year courses.

* Appreciation is expressed to Dean Maynard E.
Pirsig and Professors Robert C. McClure and Stanley V.
Kinyon of the School of Law at the University of
Minnesota for their interest in and cooperation with
this project.


Method 


The 207 freshmen entering the law school of 
the University of Minnesota in the fall of 
1947 were given a battery of tests during the 
first week of the quarter. These tests included 
the Miller Analogies Test, Form G, and the 
Iowa Legal Aptitude Test, consisting of seven
parts—analogies, reasoning, opposites, rele- 
vancy, mixed relations, memory, and legal 
information. A total score was also deter- 
mined. Other data available for these fresh- 
men included scores on the ACE Psychol. 
Exam., 1937 form, taken for the most part 
during the senior year in high school; high 
school percentile rank; pre-legal honor point 
ratio; and grades based on the first quarter’s 
work. These latter grades are based on 
comprehensive and lengthy essay type exam- 
inations. Students are not dropped from 
school on the basis of grades on these tests since 
these first quarter examinations are regarded 
as exploratory practice. Grades on these 
first quarter examinations had a correlation 
of .56 with first year grades, based on the 
comprehensive examinations covering the work 
of the entire freshman year.1

At the end of the first quarter, correlations 
were computed between test scores and 
grades for that quarter. At the end of the 
year, zero order and multiple correlations were 
computed between test scores, first quarter 
grades, and first year grades. 

1 Although the predicted first year grades are available
to the students and faculty at the close of the first
quarter, the final examinations are graded anonymously
at the end of the freshman year and the authors are
convinced the criterion is not contaminated.

The multiple regression equations determined
on the group of 1947 freshmen were
used to predict first year grades for the freshmen
who entered in the falls of 1948 and 1949.
These predictions were also made at the end
of the fall quarter in each case. The predicted
grades were compared with the obtained grades
to determine the “shrinkage.”
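
The comparison of predicted with obtained grades can be sketched as a product-moment correlation computed on a new class, set against the multiple R from the class on which the equation was built. The grades below are hypothetical, and treating "shrinkage" as the simple difference between the two coefficients is an assumption made for illustration.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical predicted and obtained first year grades for a later class.
predicted = [70.0, 75.0, 80.0, 65.0, 72.0]
obtained = [72.0, 74.0, 79.0, 70.0, 68.0]

r_new_class = pearson_r(predicted, obtained)
multiple_r_1947 = 0.75  # multiple R reported in the article for the 1947 class
shrinkage = multiple_r_1947 - r_new_class
print(round(r_new_class, 2), round(shrinkage, 2))
```

A regression equation fits the sample it was derived from better than it fits later samples, so some drop from the original multiple R is expected when the weights are carried forward unchanged.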


Results 


The zero order correlations between the 
predictive data and first quarter grades and 
first year grades are presented in Table 1. 

The regression equation for predicting first 
quarter grades is: 


$$\hat{Y}_1 = 6.3510X_2 + .1195X_3 + 52.2233$$

and the multiple R = .49,
where

$\hat{Y}_1$ = Fall quarter grades
$X_2$ = Pre-law honor point ratio
$X_3$ = Legal information


The regression equation for predicting first
year grades is:


$$\hat{Y}_2 = .3166X_1 + .0542X_2 + 3.5544X_3 - .3848X_4 + .4738X_5 + .0399X_6 - .2183X_7 + 38.1640,$$


Table 1


Correlations Between Test Scores and Grades at the
End of the First Quarter and at the End of
the First Year for Law Freshmen
Entering in 1947


                                  Correlation     Correlation
                                  with 1st        with 1st
                                  Quarter         Year
Predictor                         Grades          Grades*

Miller Analogies Test, Form G       .22             .40
ACE Psychol. Exam.                  .23             .49
High school percentile rank         .23             .39
Pre-law honor point ratio           .48             .55
Iowa Legal Aptitude Test:
   Total                          [illeg.]        [illeg.]
   Analogies                      [illeg.]        [illeg.]
   Reasoning                        .17             .38
   Opposites                        .24           [illeg.]
   Relevancy                        .21             .33
   Mixed relations                  .26             .30
   Memory                           .14             .16
   Legal information                .27             .27
First quarter grades                                .56


* Complete data were available on 99 of the 207
freshmen.


Ralph F. Berdie and Wilbur L. Layton 


where
$\hat{Y}_2$ = First year grades
$X_1$ = First quarter grades
$X_2$ = A.C.E. (percentile rank)
$X_3$ = Pre-law honor point ratio
$X_4$ = Mixed relations
$X_5$ = Analogies (Iowa)
$X_6$ = High school rank
$X_7$ = Memory


The multiple R, using these seven variables 
to predict freshman grades, was .75. Combin- 
ing the variables to predict first year 
grades increased the correlation as the multiple 
R of .75 is higher than the highest zero order 
correlation of .56 between first year grades and 
first quarter grades. In predicting first quarter 
grades, however, combining the variables, 
which resulted in a multiple R of .49 did not 
provide significantly better prediction than 
was obtained by using pre-law honor point 
ratio alone, where the zero order correlation 
was .48. 

Using Hull’s method of expanding weights
to allow the range of predicted grades to
approach the range of obtained grades, the
equation for predicting first quarter grades
becomes:


$$\hat{Y}_1 = 12.95X_2 + .2438X_3 + 44.18$$


and the equation for predicting first year grades 
becomes: 


$$\hat{Y}_2 = .4211X_1 + .0721X_2 + 4.7274X_3 - .5118X_4 + .6302X_5 + .0531X_6 - .2903X_7 + 26.87$$
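The relation between the original and the expanded weights can be checked numerically. The sketch below assumes (this is not stated in the text) that the expansion multiplies every regression weight by one common factor, and that this factor is the reciprocal of the multiple R, which is the scaling that makes the standard deviation of predicted grades equal that of obtained grades. The coefficients are copied from the two seven-variable equations; the function name is illustrative only.

```python
# Regression weights for X1..X7 from the original seven-variable equation.
original = [0.3166, 0.0542, 3.5544, -0.3848, 0.4738, 0.0399, -0.2183]
# Corresponding weights after expansion, as printed in the text.
expanded = [0.4211, 0.0721, 4.7274, -0.5118, 0.6302, 0.0531, -0.2903]


def expansion_factors(orig, exp):
    """Ratio of each expanded weight to its original weight."""
    return [e / o for o, e in zip(orig, exp)]


factors = expansion_factors(original, expanded)
mean_factor = sum(factors) / len(factors)

# If the common factor is 1/R (an assumption), the implied R is its reciprocal.
implied_r = 1.0 / mean_factor

print([round(f, 3) for f in factors])
print(round(implied_r, 2))
```

Under that assumption the ratios agree with one another to about three decimals, and the implied R agrees with the reported multiple R of .75 to two decimals.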


For several students, high school percentile ranks, pre-legal grades
and scores on the ACE Psychol. Exam. were not available, so a
prediction equation based on four predictors was developed.

$$\hat{Y}_2 = .6246X_1 - .4197X_4 + .7385X_5 + .3116X_8 + 13.[\text{digits illegible}]$$

where $X_8$ = reasoning (Iowa test). The multiple R provided by this
combination was .68.

In the falls of 1948 and 1949 the two freshman classes entering those
years were given the tests found useful in predicting grades for
the 1947 class and at the close of the fall


Predicting Success in Law School 


Table 2

Correlations Between Obtained First Year Grades and Grades Predicted on the
Basis of Four and Seven Variable Regression Equations for the
Freshmen, 1947-1949

Class             R Based on      R Based on
Entering          4 Variable      7 Variable
Year of:          Prediction      Prediction

1947                 .68             .75
1948                 .67             .59
1949                 .79             .73


quarters, first year grades were predicted,
using both the seven and the four variable
prediction equations presented earlier, those
with the expanded weights. The obtained
first year grades were then correlated with
the predicted grades. These correlations are
shown in Table 2.

For the 1948 class, the zero-order correlation
between obtained grades and predicted grades,
using the seven variable equation, was .59,
and using the four variable equation, .67.
Similar correlations for the 1949 group were
.73 and .79. These are to be compared to the
originally obtained multiple R’s of .75 and .68.
The means and standard deviations for these
three years are found in Table 3.

Table 4 presents the distribution of predicted
grades as compared to the obtained grades for




Table 4

Per Cent of Students Each Year with Predicted and Obtained Grades,
Predictions Being Based on Four Variable Equation

                              Predicted Grades
Obtained
Grades     Year     50-59    60-69    70-79    80-89    Total

80-89      1949                          7       11       18
           1948                 1        7        6       14
           1947                 1       13        5       19
70-79      1949       1        11       30        6       48
           1948       3        22       28        3       56
           1947       1        13       37        1       52
60-69      1949       4        17        6                27
           1948       7        13        6                26
           1947       4        19        4        1       28
50-59      1949       2         4        1                 7
           1948       2         3                          5
           1947       2         1                          3
Totals     1949       7        32       44       17
           1948      12        39       41        9
           1947       7        34       54        7


each of the three years. Of the freshmen 
entering in 1947, 1 per cent had predicted 
grades of 60 to 69 and actually obtained grades 
of 80-89, whereas of this total class, 5 per cent 
had predicted grades of 80-89 and actually 


Table 3

Means and Standard Deviations of Test Scores and Grades for the Law Freshman
Classes 1947, 1948, and 1949

                                        1947
                            Group A            Group B             1948               1949
Variable                  Mean   SD   N*     Mean   SD    N     Mean   SD    N     Mean   SD    N

First year grades         72.6   7.0  99     72.4   7.1  163    73.2   7.3  250    71.4  [illegible]
First quarter grades      [ill.] 7.4  99     69.3   6.2  122    65.0   9.5  286    67.6   7.5  127
ACE (percentile rank)     [illegible] 99     60.2  26.6  151    60.2  25.7  211    56.2  25.6  123
Pre-law honor point ratio [illegible]        [illegible]         1.6    .5  268     1.7  [illegible]
Mixed relations           [illegible] 99     15.6   7.2  207    [ill.] 7.3  255    14.5   7.0  145
Analogies (Iowa)          21.5   6.4  99     21.2   6.7  207    20.3   6.7  254    20.5   6.3  145
Memory                    [illegible] 99     [illegible]        17.1   3.3  255    18.3   3.4  145
Reasoning (Iowa)          [ill.] 6.4  99     20.6   6.6  207    20.2   6.3  255    21.1   6.5  144
High school rank          75.3  23.2  99     69.0  24.2  146    65.8  23.8  203    67.6  22.9  123

* For the 1947 class, group A is the group on which the original regression
equations were based and group B consists of the total group for whom a
particular measure was available. The N's for the 1948 and 1949 groups vary
for the same reason. The differences among these groups are so slight that
they probably do not account for the differences in the correlations between
predicted and obtained grades for the three years.




Table 5 


Percentage of Students Each Year with Various Predicted Grades Who Failed 


Year 
Predicted Grade 1947 1948 1949 
Percentage with predicted grade below 60 who actually failed 90 76 87 
Percentage with predicted grade of 60 or above who actually failed 26 30 28 
Percentage with predicted grade of 70 or above who actually failed 7 12 12 
Percentage with predicted grade of 80 or above who actually failed 0 5 0 


obtained those grades. For that year, 26 
per cent had predicted grades of below 70 
and actually obtained grades of below 70, 
while 15 per cent of that class had predicted 
grades of below 70 but actually did better 
than that. Table 5 summarizes some of the 
information presented in Table 4. 
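The two percentages quoted for the 1947 class can be re-derived from the 1947 rows of Table 4. The sketch below assumes that blank cells in the table stand for zero; the dictionary layout and function name are illustrative, not part of the original report.

```python
# 1947 rows of Table 4 (per cent of the class).
# Keys: obtained grade band. Columns: predicted 50-59, 60-69, 70-79, 80-89.
# Blank cells in the printed table are assumed to be 0.
table_1947 = {
    "80-89": [0, 1, 13, 5],
    "70-79": [1, 13, 37, 1],
    "60-69": [4, 19, 4, 1],
    "50-59": [2, 1, 0, 0],
}


def per_cent(table, obtained_bands, predicted_columns):
    """Sum the cells for the given obtained bands and predicted columns."""
    return sum(table[band][col]
               for band in obtained_bands
               for col in predicted_columns)


predicted_below_70 = [0, 1]          # columns 50-59 and 60-69
obtained_below_70 = ["60-69", "50-59"]
obtained_70_or_more = ["80-89", "70-79"]

both_below = per_cent(table_1947, obtained_below_70, predicted_below_70)
predicted_low_did_better = per_cent(table_1947, obtained_70_or_more,
                                    predicted_below_70)
column_totals = [sum(row[c] for row in table_1947.values()) for c in range(4)]

print(both_below, predicted_low_did_better, column_totals)
```

The cell sums reproduce the 26 per cent and 15 per cent given in the text, and the column sums give the 1947 totals row.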

Thus, of 46 students predicted to have 
grades of 80 or more, only one failed, when the 
lowest passing grade is 70. Of 43 students 
predicted to have grades of less than 60, 
eight passed. 


Summary 


The predictions studied here differ from
those previously reported² insofar as a progress
measure of success at the end of the first
third of the freshman year is used as one of the
predictors of academic standing at the close
of that year. Pre-legal honor point ratio was
found to be the best single predictor of freshman
grades here (with the exception of trial
examination grades), as has been reported
consistently in other studies. High school
percentile rank, the ACE Psychol. Exam., and
the Iowa Legal Aptitude Test also contributed
to effective prediction.

2 Stuit, D. B., et al. Predicting success in professional
schools. Washington, D. C.: American Council on
Education, 1949.

The multiple correlations reported here are 
not significantly different from others found at 
Minnesota and at other places attempting 
to predict grades in law school. The absence 
of any significant “shrinkage” in predictive 
efficiency over the three year period is encouraging
for those attempting such predictions.


Received August 10, 1951. 


The ACE Psychological Examination and High School Standing as 
Predictors of College Success 


Norman Frederiksen 


Princeton University and Educational Testing Service 


and 


W. B. Schrader 


Educational Testing Service 


A study of the college adjustment of veteran
and nonveteran students has recently been
completed by the Educational Testing Service,
with the support of the Carnegie Corporation
of New York. The design of this study
involved the calculation of validity coefficients
for high school standing and the ACE Psychological
Examination (the ACPE) for freshman
students in a number of colleges and universities.
The present article summarizes the
findings of these validity studies.


The Groups Studied 


The colleges and universities which participated
in the study were quite varied. They
were so chosen as to include private colleges,
state universities, and municipal institutions;
coeducational and men’s colleges; large universities
and relatively small colleges; colleges
with great financial resources and less wealthy
colleges; and colleges located in large cities
and those located in small towns. Geographically
they were distributed throughout the
country, but with some concentration in the
east and midwest. Sixteen colleges were
included in the entire investigation, and results
for twelve of these colleges will be reported
here.

Some care was exercised to ensure that the
group of students represented in any particular
validity study were reasonably homogeneous.
Specifically, the members of such a group were
all male students, all attended the same
institution, all were enrolled in the same
college or division of the institution (e.g.,
liberal arts or engineering), and all had entered
the institution in the fall of 1946. Within
each group, veteran and nonveteran students
were treated separately. Since, in three of
the twelve colleges considered, separate studies
were made of students enrolled in different
divisions, a total of fifteen groups, each containing
both veteran and nonveteran students,
were investigated. In each group the validity
of the ACE Psychological Examination or of
some measure of high school standing, or of
both, was studied. In each case, the criterion
was the first-year college average grade.


Results 


The validity coefficients are presented in 
Table 1; the corresponding means and standard 
deviations are shown in Table 2. (In general, 
the means and standard deviations are 
comparable only for veterans and nonveterans 
at the same institution, since the scales are not 
necessarily the same at different institutions. 
Raw scores on the ACPE were based on 
various forms of the test, and in some instances 
percentile scores were converted to standard 
scores with a mean of 13 and a standard 
deviation of 4. High school standing was 
reported on a variety of scales.) 

The validity coefficients for the ACPE 
showed considerable fluctuation from one sub- 
group to another; much of this variability is 
presumably ascribable to sampling error, 
particularly since some of the subgroups are 
rather small. The median correlation for the 
twenty-two subgroups of male students where 
ACPE total score was used is .47. The Thurstones²
observed, in 1932, that the correlations

1 For the eleven veteran subgroups, the validity coefficients
range from .35 to .60, but it was found that the
variation among the coefficients was no greater than
would be expected to arise by chance in groups of this
size. For the eleven nonveteran subgroups, the range
is from .28 to .61; in this case, there are less than 2
chances in 100 that as great or greater variability would
arise by chance.

2 Thurstone, L. L., and Thurstone, Thelma G. The 


ay aioe Examination. Educ. Rec., 1932, 13, 




Table 1

Correlations of ACPE Scores and Measures of High School Standing with
First-Year College Average Grade

                                                Validity Coefficients
                                                               H. S.       Measure of
                                                 ACPE         Standing     H. S. Standing       N
College  Division     Description of College   MV*   MN**    MV*    MN**   Used               MV    MN

A        [illegible]  [illegible] private, east  -     -     .55    .60    Adjusted Rank      694   531
B        [illegible]  [illegible]                -     -     .53    .62    Adjusted Rank      536   570
C        Arts         Coed., private, east      .41   .39    .43    .58    Rank               111   112
D        Arts         Coed., private, south     .35   .55    .52    .65    Average Grade       77   119
E        Arts         Coed., private, midwest    -     -     .49    .65    Rank               164   175
F        Arts         Coed., private, midwest   .51   .45     -      -                        283    94
G        Arts         Coed., public, midwest     -     -     .61    .61    Average Grade      466   166
H        Arts         Coed., public, midwest    .49   .34     -      -                        111   115
I        Arts         Coed., public, midwest    .57   .29     -      -                         83    12
J        Arts         Coed., public, west       .46   .51    .53    .59    Average Grade      433   222
K        Engr.        Coed., public, midwest    .60   .61    .62    .53    Average Grade       71   128
L        Engr.        Coed., public, midwest    .42   .28   [ill.]  .49    Rank               352    98
I        Engr.        Coed., public, midwest    .45   .36     -      -                        167   171
K        Agric.       Coed., public, midwest    .52   .48    .56    .68    Average Grade      140   102
H        Bus.         Coed., public, midwest    .55   .57     -      -                        142    65

Median Validity Coefficient                     .49   .45    .53    .60

* Male veterans.
** Male nonveterans.
between ACPE scores and college grades
averaged around .50, which is in good agreement
with the present findings.

There is a slight tendency for the test to
yield higher correlations with grades of
veterans than with grades of nonveterans;
the median validity coefficients are .49 and
.45, respectively.³ The slight tendency for
validity coefficients to be higher for veterans is
consistent with the hypothesis that veterans
are more uniformly motivated than nonveterans
and thus tend more than nonveterans
to achieve the grades they are capable of
earning.

The correlations of high school standing with
first-year grades varied from .43 to .68 in the
twenty subgroups where it was used as a
predictor; the median validity coefficient is
.57.⁴ High school record thus furnishes a
somewhat more accurate prediction, for these
groups, than the ACPE. While the ACPE
tended slightly to be more valid for veterans
than for nonveterans, the opposite tendency
was found for high school standing; the median
validity coefficients were about .53 and .60
for veteran and nonveteran males respectively.
Out of ten comparisons, the validity of high
school standing was greater for nonveterans
in eight instances, with one tie. This difference
seems reasonable in view of the greater
amount of time elapsing between high school
and college for the veteran students. High
school grades presumably reflect motivation
and other nonintellectual factors as well as
ability to do academic work; to the extent

3 When validity coefficients for veterans and nonveterans
are compared in the eleven groups, the coefficient
is higher for the veterans in seven of the eleven
comparisons. In only one instance, however, is the
difference significant; the coefficient for veterans is
significantly higher at the 5 per cent level for the group
of engineers at University L.

4 For the ten subgroups of veterans and for the ten
subgroups of nonveterans, separately, it was found that
the variability in the correlation coefficients did not
exceed the amount to be expected by chance. When the
validity coefficients for veterans were compared with
those for nonveterans in each of the ten groups separately,
the coefficients for the nonveterans were found
to be significantly higher (5 per cent level) for students
in the liberal arts division in two universities (B and E).
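The footnotes do not say which test was used to compare the validity coefficients of veterans and nonveterans. One conventional choice for two independent samples is Fisher's r-to-z transformation; the sketch below applies it, purely as an assumption, to the high school standing coefficients and N's read from Table 1 for Colleges A, B, and E. The function name is illustrative.

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """Normal deviate for the difference between two independent correlations.

    Uses Fisher's r-to-z transformation (atanh); the standard error of the
    difference is sqrt(1/(n1 - 3) + 1/(n2 - 3)).
    """
    z1 = math.atanh(r1)
    z2 = math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (z2 - z1) / standard_error


# High school standing validities: (veterans r, N) versus (nonveterans r, N).
z_college_a = fisher_z_difference(0.55, 694, 0.60, 531)
z_college_b = fisher_z_difference(0.53, 536, 0.62, 570)
z_college_e = fisher_z_difference(0.49, 164, 0.65, 175)

for name, z in (("A", z_college_a), ("B", z_college_b), ("E", z_college_e)):
    verdict = "beyond" if abs(z) > 1.96 else "within"
    print(f"College {name}: z = {z:.2f} ({verdict} the two-sided 5 per cent bound)")
```

With these inputs the deviates for B and E exceed 1.96 while that for A does not. This is only a plausibility check under the stated assumption, not a reconstruction of the authors' computation.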




Table 2

Means and Standard Deviations of Predictor and Criterion Variables

[Table body not recoverable from the scan. Columns: College; Division;
Subgroup; ACPE (Mean, SD); H. S. Standing (Mean, SD); First-Year Avg.
College Grade (Mean, SD).]

* Male veterans.
** Male nonveterans.
that these characteristics have [illegible]
for some veterans than for others, the correlation
would be lowered. To the extent that
high school grades reflect knowledge and skills
directly useful in college work, their predictive
effectiveness would be lowered also, since the
[illegible] service were not [illegible].

The universities which supplied information 
on high school standing used various measures 
of relative performance. The three types of 
measures employed have been designated as 




average grade, rank in class, and adjusted 
rank in class. Of these, average grade suffers 
from the disadvantage that the various second- 
ary schools use marking systems which differ 
markedly in form. Rank in class overcomes 
this difficulty, and is presumably preferable 
to average grade; but the secondary schools 
may differ greatly in the calibre of their 
students, so that being, for example, 75th in 
a class of 100 does not mean the same thing in 
all schools. The adjusted rank in class 
provides a system of corrections to overcome 
this difficulty, based on past experience with 
the graduates of the various schools. 

The type of measure of high school standing 
employed at each institution is indicated in 
Table 1. The correlations shown do not 
indicate any clear superiority for any particular 
type. It must be pointed out, however, that 
the colleges vary in their use of high school 
standing in selection of students. Some of 
the colleges admit all graduates of approved 
high schools, while others do not. Students 
with poor high school records were not ad- 
mitted in any of the five groups where rank or 
adjusted rank was used. The curtailment in 
range produced by use of the measure in 
selection obviously will affect the size of the 


Table 3

Multiple Correlations of ACPE Total Score and High School Standing with
First-Year College Average Grade

College
(Division)       Subgroup              R

C (Arts)         Male veterans        .53
                 Male nonveterans     .62
D (Arts)         Male veterans        .59
                 Male nonveterans     .74
J (Arts)         Male veterans        .60
                 Male nonveterans     .65
K (Engr.)        Male veterans        .72
                 Male nonveterans     .76
L (Engr.)        Male veterans        .56
                 Male nonveterans     .53
K (Agric.)       Male veterans        .65
                 Male nonveterans     .70

Median           Male veterans        .60
                 Male nonveterans     .68




correlation obtained; the relative merits of the 
different measures of high school standing 
therefore cannot be appraised from these 
findings. It may be concluded only that all 
the colleges utilizing high school record are 
employing it in a form which is quite effective 
as a predictor of freshman average grade for 
its own students. 

In six of the groups, validity data are
available for both the ACPE and high school
standing, thus affording twelve direct comparisons
of the two predictors. For veterans in
these six groups, the median validity of the
test is .44 and of high school standing about
.52; the comparable medians for nonveterans
are about .50 and .58. In eleven of the twelve
comparisons, the validity coefficient was higher
for high school standing; such consistency
in the direction of the difference would be
expected to occur by chance fewer than once
in a hundred times.⁵ The finding of a higher
validity coefficient for high school standing
than for scores on a single aptitude test
agrees with the typical findings of previous
comparisons of this kind.
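The "fewer than once in a hundred times" statement is what an exact sign test gives for eleven agreements out of twelve. The sketch below assumes a two-sided binomial test with a chance probability of one half for each comparison and treats the twelve comparisons as independent; neither assumption is spelled out in the text.

```python
from math import comb


def sign_test_p(successes, n):
    """Two-sided exact binomial probability with p = 0.5 for each trial."""
    k = max(successes, n - successes)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


# Eleven of twelve comparisons favoured high school standing.
p_value = sign_test_p(11, 12)
print(round(p_value, 4))
```

The upper tail contains 13 of the 4096 equally likely outcomes, so the two-sided probability is 26/4096, which is below .01.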

It is, however, more proper to think of the
test and high school record as joint members
of a predictive team than as competitors.
The effectiveness of the team is shown in
Table 3, which presents the multiple correlations
of ACPE total score and high school
record with first-year average grade for the
six institutions employing both measures.
For veterans, the R’s range from .53 to .72
with a median of about .60; for nonveterans
the range is .53 to .76 with a median at about
.68. The use of the two predictors in combination
thus furnishes a better basis for prediction
than either measure used alone. The magnitude
of the correlation is great enough to indicate
that the combined measures provide a
really useful prediction of how well a particular
student is likely to succeed in his freshman
year.

5 When the differences between validity coefficients
were tested for statistical significance, high school
standing had a significantly higher validity in four
subgroups. One difference was significant at the 1 per
cent level (nonveteran engineers at University L) and
three were significant at the 5 per cent level (veteran
engineers at L, nonveterans in arts at C and in agriculture
at K).




Summary 


1. The median correlation of the ACE 
Psychological Examination with first-year 
college grades was .47; the median correlation 
of high school standing with the same criterion 
was .57. 

2. The validity of the ACPE tends to be 
slightly greater for male veterans than for 
male nonveterans, while the validity of high 
school standing tends to be greater for non- 
veterans. The median validity coefficients 




for the ACPE, based on veteran and non- 
veteran students, were .49 and .45 respectively. 
The comparable median validity coefficients for 
high school standing were .53 and .60. 

3. The use of a weighted composite of ACPE 
total score and high school standing provides 
a useful prediction of freshman average grade; 
the median multiple correlation coefficient for 
veteran students was found to be .60 and for 
nonveterans .68. 


Received August 6, 1951. 


Education and Prediction of Military School Success ¹


Virginia Zachert and Abraham S. Levine 


USAF Training Command, Human Resources Research Center, Personnel Research Laboratory, 
Lackland Air Force Base, San Antonio, Texas 


An attempt was made to explore the 

possibilities of using years of formal education 
as an additional predictive variable in various 
Air Force training programs. The procedure 
involved comparing the validities of various 
Air Force classification test composites with 
the validities resulting from using amount of 
education as a predictor. For the purposes of 
this study, none of the obtained validity 
coefficients were corrected for restriction of 
range due to initial screening on the basis of 
either aptitude test scores or education. In 
most cases, the samples were screened on the 
basis of aptitude test variables, but since 
years of education and aptitude test perform- 
ance are substantially correlated, it must be 
inferred that the populations sampled in this 
study were more homogeneous than the 
general population in both the aptitude and 
educational dimensions. For the most part, 
the enlisted airmen in this study were recruited 
from the upper 60% of the population, while 
the aircrew trainees were more representative 
of the college student population. 

All of the tests utilized in this investigation 
are included in the Airman Classification 
Battery or Aircrew Classification Battery, 
which were developed by aviation psychol- 
ogists during or following World War II. The 
educational attainment data were obtained 
from forms completed by the basic airmen. 
In each comparison the total number of 
available cases was used. 


World War II Validities 


A survey of aircrew studies during World 
War II (2) showed that the validity for 
educational level was always below that for
the appropriate stanine.² For an experimental


1 The views expressed in this article are those of the 
authors and do not necessarily represent the official 
views of the United States Air Force. 

2 By appropriate stanine is meant the one designed 
to measure success in the particular type of training
referred to; thus, the appropriate stanine for pilot 
trainees is the Pilot Stanine. A stanine is a composite 
score expressed in nine-point standard score units with 


group of 1,300 men admitted to pilot training 
without any requirements as to aptitude or 
personality, educational level had a validity of 
.20 as compared to the validity of .64 for the 
Pilot Stanine. The criterion measure em- 
ployed was graduation vs. elimination in 
Primary Flying School. However, for a 
group initially screened with the Army Air 
Forces Qualifying Examination, the validity 
of the Pilot Stanine was .55 and education 
dropped to .02. Apparently, virtually all of 
the valid variance in the education predictor 
was accounted for by the screening test. For 
a sample of nearly 2,000 advanced trainees, 
selected on the Navigator Stanine, the validity 
for education was .11 while the Navigator 
Stanine validity was between .50 and .60. 
For bombardier trainees, the validity of the 
Bombardier Stanine ranged from .10 to .30, 
and the education validity was about .05. 


Postwar Validities (Aircrew) 


Since October 1947, applicants for pilot 
training were required to present a minimum 
of two years of college education in addition 
to passing a screening examination. Thus, 
with respect to educational level the postwar
pilot trainees represented a somewhat more
restricted group than the World War II cadets.
For all 1949 basic pilot classes (total N
= 1596), the biserial validity of the Pilot
Stanine is .54, and the educational level
validity is .04 (1). These validities are
consistent with those obtained in World War
II studies on screened groups.


Postwar Studies (Airmen) 


Table 1 presents validities obtained for
the first form of the Airman Classification
Battery and comparative validities obtained
for years of education (3). The table also
shows the respective contributions of education


a mean of 5 and a standard deviation of 2. The lowest
possible stanine is 1 and the highest is 9.
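The footnote's definition can be illustrated with a small conversion from a standard (z) score to the nine-point scale. This is a sketch of the arithmetic only: it assumes a linear rescaling to mean 5 and standard deviation 2 with rounding and clipping to the range 1 to 9, and it does not reproduce how the Air Force composites were actually weighted or normed. The function name is illustrative.

```python
import math


def stanine_from_z(z):
    """Convert a z score to a nine-point standard score (mean 5, SD 2)."""
    rescaled = 5 + 2 * z
    rounded = math.floor(rescaled + 0.5)   # round half up
    return max(1, min(9, rounded))          # lowest stanine 1, highest 9


examples = {z: stanine_from_z(z) for z in (-3.0, -1.0, 0.0, 1.0, 3.0)}
print(examples)
```

A score at the mean maps to 5, one standard deviation above the mean maps to 7, and extreme scores are held at 1 or 9.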


Table 1

Comparison of Airman Classification Battery (AC-1A) and Education for
Prediction of Final Technical School Grades

                              Multiple R*
                              of Tests      Education     Ed. Adds
School              N         in ACB        Validity      to R
Clerk-Typist 1199-4715 68 58 05 
Stenographer 57-197 4 42 02 
Aircraft Welder 43-145 7 -28 .00 
Sheet Metal Worker 168-496 - 62 31 .00 
Carpenter 222-881 49 29 01 
Electrician 116-412 .61 .36 .00 
Fabric and Dope Mechanic 49-107 44 35 .02 
Machine saz 
Parachute Rigger 35- 57 35 .03 
Plumber “5 48-228 49 -22 .00 
Sheet Metal Worker te es 44 04 
Radar Mechanic ae R .39 00 
Radio Mechanic EFS i -61 .06 
Remote Control Turret Mechanic kor F -30 .00 
Construction Equipment Operator E z í a .12 00 
Firefighter and Crash Rescueman rate = oe 00 
Airplane Electrical Mechanic > -25 -00 
243-269 .62 —_"* 06 

A & E Mechanic 722-1994 79 0 "l 
A & E Mechanic (Conventional) e E 01 
N i 538-1641 64 38 00 
& E Mechanic (Jet) 53-123 7 40 ae 
Airplane Propeller Mechanic | 87-168 "2 a ey 

Airplane Instrument Mechanic Sete a Z “l 
Auto Mechanic Technician i à -00 
Olei Memon 44-235 33 06 a 
Eo ea 86-319 «54 —.04 a 
Semen, Operating ea 255-282 ‘60 95 m 
rimary Armament Techni 68-740 58 A 6 
Control Tower Operator 92-101 45 £4 a 
Radio Operator AACS nail) 141-243 “58 36 T 
SEE Operator (High Speed Mar 121-433 54 30 00 
aaa 52-311 61 35 o 
p'ographer 190-559 65 45 mi 


Weather Observer 


* The validity coefficients are based on the best combination of tests
in the battery. When the tests are [illegible], the R's reported above
should show some shrinkage.
** For N of 23, Educ. Valid. equals .64.

to the multiple R of the battery for predicting
final grades in various Air Force technical
schools. Validation data on these 32 technical
schools indicate that the ability tests measure
nearly all the valid variance to be found in the
educational level variable as well as much
variance not accounted for by years of education.
The average increase in the multiple
correlation when education is added is .012,
and the most it adds is .06. Although in
many cases the validity for educational level
is above .50, the multiple R for the tests is
at least .15 higher.


Summary 


1. Years of education apparently adds little 
to the predictive efficiency of the Aircrew 
Classification Battery for various categories 
of aircrew training. Neither does it appreci- 
ably improve multiple prediction of most 
airmen technical school criteria. 




2. Educational level does appear to con- 
tribute effectively to the prediction of a few 
airmen technical school criteria, e. g., Clerk- 
Typist, Radio Operator. Cross-validity data 
should be obtained on these schools in order 
to verify these findings. If they are confirmed, 
it might prove feasible to weight amount of 
education in terms of its contribution to a 
valid battery of tests. 


Received August 8, 1951. 


References 


1. Dailey, J. T., and Gragg, D. B. Postwar research 
on the classification of aircrew. Research Bulle- 
tin 49-2, Directorate of Personnel Research, 
Human Resources Research Center, Lackland 
Air Force Base, November 1949. 

2. Dubois, P. H. The classification program. Army 
Air Forces Aviation Psychology Program Re- 
search Reports, No. 2. Washington: 1947. 

3. Zachert, Virginia. Comparison of aptitude tests and
years of education as predictors of military success. 
Research Note 51-9, Directorate of Personnel 
Research, Human Resources Research Center, 
Lackland Air Force Base, April 1951. 


The Use of Levers in Making Settings on a Linear Scale * 


William Leroy Jenkins and Merritt W. Olson 
Lehigh University 


Various factors in the use of knobs for
making settings on a linear scale have been
investigated in three previous studies (1, 2, 3).
The most significant factor turns out to be
the ratio between the movement of the pointer
along the scale and the rotation of the knob.
For relatively fine tolerances (.007 in. and
thereabouts), the optimal ratio is one or
two inches of pointer movement for one
complete turn of the knob.

With much coarser tolerances, it would seem
possible that a lever might permit more
rapid settings than a knob. The present
study is concerned with a comparison between
levers and knobs and an attempt to define the
significant factors in the use of levers.


Apparatus and Procedure 


The apparatus and procedures were essentially
the same as those described for the
previous studies with knobs (1, 2, 3). The
subject matches the position of a lighted insert
in a black bakelite scale with a pointer controlled
by a knob or lever turning in a plane
parallel to [text illegible]. Some minor
modifications were made in the
nature of the scales and in the controlling
mechanism for the various experiments.

The permitted error tolerance is determined
by the width of the pointer in relation to the
width of the lighted insert. Two tolerances
were employed: .016 and .100 in.

Time was measured separately for reaching
the approximate location and for making the
final adjustment. For simplicity, however,
results are reported only in terms of the total
time.


Results¹


Experiment 1. The apparatus as used
previously in the knob studies was employed
with six and twelve-inch levers in comparison
with a 2% in. knob. A number of shaft-turn
ratios were combined with the levers and
knobs. Thirteen subjects completed the series.
In over-all performance, the knob with a ratio
of 4.08 inches of pointer movement to one
complete turn of the shaft proved slightly
superior to any of the lever-combinations,
with a tolerance of .016 in., the coarsest that
could be obtained with the unmodified
apparatus.

* This research was executed under Contract No.
W33-038-ac-22561 between the Institute of Research,
Lehigh University, and the USAF Air Research and
Development Command, Wright-Patterson Air Force
Base, Dayton, Ohio.

1 Detailed tabulated data can be found in USAF
Technical Report No. 6563.

Experiment 2. In order to test a much 
coarser tolerance, a new scale was constructed 
with .125 in. inserts. With a .025 in. pointer, 
this permitted a tolerance of .100 in. With a 
.109 in. pointer, the tolerance was .016 in. as 
in Experiment 1. Seven subjects, none the 
same as in Experiment 1, completed the series. 
With these subjects, there was a slight superior- 
ity of levers over knobs at both .016 and .100 
in. tolerances. 

More striking was the fact that the general 
level of total times was about two-thirds that 
of Experiment 1. Are these shorter times 
simply a matter of differences in the two 
groups of subjects? Or does the wider insert 
make for more rapid settings, even when the 
tolerance is the same? 

Experiment 3. The third experiment was 
designed specifically to answer these questions. 
A new scale was constructed with alternate 
wide and narrow inserts (.125 and .016 in.). 
Alternate runs were made with pointers yield- 
ing .016 in. tolerance on both widths of 
inserts. Six of the seven subjects from 
Experiment 2 completed the series. 

The total times are consistently shorter 
with the wide inserts (mean difference .015 
sec.), although the major difference between 
Experiments 1 and 2 must have been a 
difference in the two groups of subjects. It is 
interesting to note that a wide pointer with a 
wide insert is better than a narrow pointer 
with a narrow insert, even though the tolerance 
is the same. 

Experiment 4. To broaden the comparison
between knob and lever combinations, the
knob was combined with ratios of 4.08, 
6.28, 9.70, and 16.3; while the twelve-inch 
lever was used with ratios of 16.3 and 33.6. 
The scale with wide inserts was employed 
with tolerances of .016 and .100 in. Twenty- 
four subjects completed the series, making 
twenty settings apiece at five distances (3545, 
27, 40, and 72 sixteenths of an inch) under each 
set of conditions. Thus the over-all mean for 
each set of conditions is based on 2400 readings. 

At a tolerance of .016 in. the knob combined
with ratio 6.28 is slightly better than either of 
the lever combinations. At a tolerance of 
.100 in. there is little to choose. 

Experiment 5. Up to this point no lever 
longer than twelve inches had been tested. 
In order to use longer levers, it was necessary 
to construct a folding arm carrying a double 
sprocket-and-chain drive; so that the position 
of the lower end of the lever could be lowered 
as the lever-length was increased. Lever 
lengths of 4, 6, 9, 12, 18, 24, and 30 inches 
were used with ratios of 16.3 and 33.6. Fifteen 
subjects completed the series. The over-all 
mean under each condition is based on 1500 
readings. 

It became evident that the total times were 
related to the ratio between the movement of 
the lever-tip (L) and the movement of the 


pointer (P). The ratio L/P can be computed
by the formula:

$$L/P = \frac{\text{lever length in inches} \times 2\pi}{\text{inches of pointer movement for one complete turn of the shaft}}$$


The main curve of Figure 1 shows the relation
between mean total time and L/P. The
curve drops rapidly to an L/P of about
three and then flattens off, changing only
slightly with higher values. The optimal L/P
appears to be approximately three or four.
That is, the lever tip should travel three or
four times as fast as the pointer. By what
combination of lever-length and shaft-turn-ratio
the optimal L/P is achieved seems to make
little difference.
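
The L/P relation can be checked numerically. The following is a minimal sketch, not part of the original report; the function name `lever_pointer_ratio` is hypothetical, and the lever lengths and shaft-turn-ratios are those listed for Experiment 5.

```python
import math


def lever_pointer_ratio(lever_length_in: float, shaft_turn_ratio_in: float) -> float:
    """Return L/P, the lever-tip travel per unit of pointer travel.

    One complete turn of the shaft carries the lever tip around a circle of
    circumference 2*pi*lever_length and moves the pointer by
    shaft_turn_ratio_in inches.
    """
    return lever_length_in * 2 * math.pi / shaft_turn_ratio_in


# Lever lengths (inches) and shaft-turn-ratios reported for Experiment 5.
lever_lengths = [4, 6, 9, 12, 18, 24, 30]
shaft_turn_ratios = [16.3, 33.6]

for ratio in shaft_turn_ratios:
    for length in lever_lengths:
        lp = lever_pointer_ratio(length, ratio)
        print(f"lever {length:>2} in., ratio {ratio:>4}: L/P = {lp:.2f}")
```

With a ratio of 33.6, the 12-inch and 30-inch levers give L/P values of about 2.24 and 5.61, which agrees with the range of 2.25 to 5.61 quoted for Experiment 6.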

For comparison, the mean total time with
the knob with ratio 6.28 is shown in the lower
left-hand corner of Figure 1. The best
performances with knob and with lever come
out about even.


William Leroy Jenkins and Merritt W. Olson 


[Figure 1: mean of total times in seconds (vertical axis, about 1.00 to 1.60) plotted against L/P (lever-tip/pointer, horizontal axis). Recoverable labels: no added friction (Exp. 5), with added friction (Exp. 6), no added friction, knob (Exp. 4), optimal L/P.]

Fig. 1. Total time as a function of L/P.


Experiment 6. By means of a disc-brake,
friction was added such as to require an additional
400 grams pull at the tip of the twelve-inch
lever. Kinesthetically, this amount of
added friction is distinctly noticeable and
is described by the subjects as “moderate.”
Lever lengths of 12, 18, 24, and 30 inches were
combined with a ratio of 33.6, giving L/P's
ranging from 2.25 to 5.61. The tolerance
was .100 in.

The shorter curves in Figure 1 show the
relation between mean total times and L/P,
with and without added friction. When friction
is added, the curve descends more rapidly, but
the optimal region remains unchanged.


Discussion 


For making settings on a linear scale nine
inches in length, the optimal L/P appears to
be three or four. It seems likely that the
same optimum would apply to longer scales,
although our apparatus did not permit making
such tests.

However, with scales longer than approximately
twelve inches, an L/P of three cannot
be achieved, because the maximum L is
limited by the maximum reach of the subject.
As shown graphically in Figure 2, the maximum
is approximately the same (38 in.) regardless of
lever length. This means that the maximum
L/P cannot exceed 38 divided by the scale
length in inches.


Fig. 2. Maximum reach.


The appropriate shaft-turn-ratio (inches of
pointer movement for one complete turn of the
shaft) can be computed from the formula:

$$\text{Shaft-turn-ratio} = \frac{\text{lever length in inches} \times 2\pi}{L/P}$$

For example, with a twelve-inch scale, the
maximum L/P is 38/12 = 3.17, which is in the
optimal region. If a 30-inch lever is used, the
appropriate shaft-turn-ratio is
30 × 2π/3.17 ≈ 60. If a twelve-inch lever is
used, the appropriate shaft-turn-ratio is
12 × 2π/3.17 ≈ 24.
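
The reach limit and the shaft-turn-ratio formula can be worked through in a few lines. This is an illustrative sketch only; the function names are hypothetical, and the 38-inch figure is the approximate maximum reach reported in the text.

```python
import math

MAX_REACH_IN = 38.0  # approximate maximum lever-tip travel reported in the text


def max_lever_pointer_ratio(scale_length_in: float) -> float:
    """Largest attainable L/P: maximum reach divided by scale length."""
    return MAX_REACH_IN / scale_length_in


def shaft_turn_ratio(lever_length_in: float, lp: float) -> float:
    """Inches of pointer movement per complete shaft turn for a desired L/P."""
    return lever_length_in * 2 * math.pi / lp


# Worked example from the text: a twelve-inch scale.
print(round(max_lever_pointer_ratio(12), 2))   # maximum L/P
print(round(shaft_turn_ratio(30, 3.17), 1))    # 30-inch lever
print(round(shaft_turn_ratio(12, 3.17), 1))    # 12-inch lever

# Longest scale on which an L/P of three is still attainable.
print(round(MAX_REACH_IN / 3, 1))
```

The two shaft-turn-ratios come out near 59.5 and 23.8, which the article rounds to 60 and 24. The last line gives about 12.7 inches, consistent with the statement that an L/P of three cannot be reached on scales longer than approximately twelve inches.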


Summary 


Right-left-moving levers of various lengths
were used to match a pointer to lighted
inserts on a horizontal nine-inch scale
with tolerances of .016 and .100 in. Rotary
control knobs were used for comparison.


With lever lengths between six and thirty
inches, the important factor seems to be the
ratio of the movement of the lever-tip (L) to
the movement of the pointer (P). The
optimal L/P appears to be approximately three
or four. That is, the lever-tip should move
three or four times as fast as the pointer. Adding
friction accentuates the optimum without
changing it. Slightly faster settings are made
with a wide insert and a wide pointer than
with a narrow insert and a narrow pointer,
even though the tolerance (.016 in.) is the same.

The maximum L is limited by the maximum
reach of the subject (about 38 in.). The 
maximum L/P therefore cannot exceed 38 
divided by the scale length in inches. With a 
scale longer than about twelve inches an L/P 
of three cannot be achieved. 

A rotary knob with a ratio of about six 
inches of pointer movement for one complete 
turn of the knob appears to be as good as 
any lever combination in making settings to 
coarse tolerances on a nine-inch scale. It 
seems probable that this would also hold true 
for longer scales. 


Received September 4, 1951. 


References 


1. Jenkins, W. L., and Connor, M. B. Some design
factors in making settings on a linear scale.
J. appl. Psychol., 1949, 33, 395-409.

2. Jenkins, W. L., Maas, L. O., and Rigler, D. Influ- 
ence of friction in making settings on a linear 
scale. J. appl. Psychol., 1950, 34, 435-439. 

3. Jenkins, W. L., Maas, L. O., and Olson, M. W. 
Influence of inertia in making settings on a linear 
scale. J. appl. Psychol., 1951, 35, 208-213. 


Dimensional Analysis of Motion: III. Complexity of
Movement Pattern²


Gerald Rubin, Patricia Von Trebra, and Karl U. Smith 


University of Wisconsin 


The effect of varying the complexity of 
motion upon the efficiency of skilled move- 
ments is a fundamental problem in modern 
industrial work. But systematic study of the 
effects of different complexities of movement 
upon the duration of a motion pattern has 
never been conducted. The present investigation,¹
which represents part of a program of
research on the analysis of work motions, 
deals with the problem just mentioned, that 
of the interrelation between complexity of 
motion pattern and economy of movement. 

For experimental purposes, it is essential to 
define the term, “complexity of movement.” 
Obviously, it is possible to vary the relative 
difficulty and complexity of a motion pattern 
by changing the environmental stimuli that 
effect reactions, the physiological conditions 
related to the task, or the past learning and 
other dynamic factors which define the motion. 
But in terms of the characteristics of the 
actual movement itself, the relative complexity 
of motion can be varied in only a limited 
number of ways. Specifically, the motion 
complexity may be varied in terms of the 
following dimensions: (1) the total number of 
unit movements; (2) the pattern of manipula- 
tion; (3) the plane of movement; (4) the 
directions of movement within a given plane; 
and (5) the handedness or laterality relations 
within a repetitive pattern of uniform manip- 
ulations of the movement pattern. 

In this experiment, the effects of varying 
the number of directional changes in reactions 
in a single plane of motion upon the efficiency 
of a movement pattern have been investigated. 
New techniques of analysis of human motion, 
which have been developed for such dimen- 
sional study of movements (2), permitted 
variation in the relative number of directional 
changes in the movement to be made, without 
altering materially the distance of travel of
the motion, the characteristics of manipulation,
and other factors which affect the measured
properties of the reactions. In addition, the
effects of practice on the different complexity-levels
of motion used were investigated.

1 The cooperation of Omer Jones in the initial conduct
of this experiment is appreciated.

2 Supported in part by funds received from the Graduate
Research Committee, University of Wisconsin.


Experimental Procedure 


The motion studied in this experiment is a
simple work pattern consisting of switch-turning
movements used in the operation of a
large control panel. The special techniques
used in the study permitted separate measures
of the basic components of travel time and
manipulation time in addition to the over-all
performance time. The procedure of the study
is to measure simultaneously the travel and
manipulation times of the movement pattern
under four conditions of complexity.

The apparatus used to conduct this study
was the Universal Motion Analyzer (Figure 1).
The planned work situation of this particular
arrangement of the apparatus consists of a large
control panel of separate rows and columns of
turn switches. The switches are operated by
the subject in predefined patterns. In this
apparatus, the subject constitutes a part of
the electrical circuit in such a way that his
touching and release of successive switches
operate two different time clocks. As soon
as the subject touches a switch a manipulation
time clock is started. When this switch is
released, the manipulation time clock is stopped
and a travel time clock is started. Upon touching
a second switch, the travel time clock is
stopped and the manipulation time clock is
again started, and so on.
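
The alternation of the two clocks can be sketched in code. This is a hypothetical illustration, not a description of the analyzer's circuitry; the event format (time-ordered pairs of a timestamp in seconds and the word "touch" or "release") and the sample values are assumptions made for the example.

```python
from typing import Iterable, Tuple


def accumulate_clock_times(events: Iterable[Tuple[float, str]]) -> Tuple[float, float]:
    """Return (manipulation_time, travel_time) in seconds.

    Time between a touch and the following release is credited to the
    manipulation clock; time between a release and the next touch is
    credited to the travel clock.
    """
    manipulation = 0.0
    travel = 0.0
    last_time = None
    last_kind = None
    for timestamp, kind in events:
        if kind not in ("touch", "release"):
            raise ValueError(f"unknown event kind: {kind!r}")
        if last_kind == "touch" and kind == "release":
            manipulation += timestamp - last_time
        elif last_kind == "release" and kind == "touch":
            travel += timestamp - last_time
        elif last_kind is not None:
            raise ValueError("events must alternate between touch and release")
        last_time, last_kind = timestamp, kind
    return manipulation, travel


# Made-up timestamps for two switches turned in succession.
sample = [(0.00, "touch"), (0.42, "release"), (0.90, "touch"), (1.30, "release")]
manip, trav = accumulate_clock_times(sample)
print(f"manipulation {manip:.2f} s, travel {trav:.2f} s")
```

Keeping the two totals separate is what allows travel and manipulation to be analyzed as distinct components of the same movement pattern.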

Figure 2 describes diagrammatically the four
levels of complexity used in the experiment.
In this diagram, the small circles refer to the
particular switches turned, and the dotted lines
indicate the movements traversed. It should
be noted that all four patterns of movement
are carried out in the frontal plane. In terms
of the preceding definitions, pattern A is the
least complex, with zero directional changes,
and D is the most complex pattern, with three
directional changes. Colored guide strings
were used to outline the four different patterns.

Table 1 summarizes the experimental design.³
The subjects used were 48 right-handed
college women, who received 4 trials per day
for four days in each of the 4 patterns of
movement. All performances were made with
the dominant right hand. The order of presentation
of the four different complexity patterns of
movement was randomized completely by subjects.

3 A parallel study of repetitive pin-pulling motion
was carried out along with this experiment. The data
of this study are not to be published because it was
found that such minor factors as length of fingernails
produced very marked variations in the data. These
observations, however, are of note in connection with
the problem of reliability of pegboard psychomotor tests.

Fig. 1. Photograph of a subject operating a control panel attached to the motion analyzer. The electrical
connections between the subject and the apparatus are not shown. Time clocks and electronic units are contained
in a separate housing.

[Figure 2: diagrams of the four movement patterns, A through D; small circles mark the switches and dotted lines the movements. A dimension label, apparently 15.2 cm, appears in the diagram.]

Fig. 2. The patterns of motion used in the experiment.


Table 1

The Plan of Experimental Observations

Subjects    Trials (16) and Patterns       Days**
            of Movement (4)*               1  2  3  4

   1        AAAA BBBB CCCC DDDD            -  -  -  -
   2        CCCC AAAA DDDD BBBB            -  -  -  -
   3        DDDD BBBB AAAA CCCC            -  -  -  -
   4        BBBB AAAA CCCC DDDD            -  -  -  -
   .
   .
  48        DDDD CCCC BBBB AAAA            -  -  -  -

* In the sixteen trials on each day, the order of
presentation of the four movement patterns, A, B, C,
and D, is randomized by subjects.

** The order of presentation of movement patterns
for a given subject is maintained during the four days
of practice.

The critical data of this experiment are the
separate measures of manipulation time and
travel time recorded in hundredths of a second
for each pattern of movement made by each
subject during each trial. In order to treat
the experimental data, a median score was
computed for each block of four trials on each
pattern of movement as run on a given day.
This median measure was used to eliminate the
effects of a few extreme scores caused by blocking
and other indeterminate factors in the
performances of the subjects.
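
The reason for scoring each block by its median can be seen in a small example. The sketch below is illustrative only; the function name `block_median` and the four trial times are made up, with one deliberately extreme score of the kind attributed to blocking.

```python
from statistics import median


def block_median(times_hundredths):
    """Median of one block of four trial times (hundredths of a second)."""
    if len(times_hundredths) != 4:
        raise ValueError("a block is assumed to hold exactly four trials")
    return median(times_hundredths)


# Hypothetical block: three ordinary trials and one extreme score.
trials = [52, 49, 51, 97]
print(block_median(trials))        # 51.5
print(sum(trials) / len(trials))   # 62.25
```

The single extreme trial pulls the mean well above the typical performance, while the median stays close to the three ordinary trials.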


Results 


Separate analyses of variance were carried
out for travel and manipulation time data.⁴
These analyses brought out the fact that only
individual differences and practice effects were
statistically significant. Complexity of the
movement pattern, which was varied in terms
of the number of directional changes in the
pattern, did not produce significant differences
in the data. Also, none of the interactions
between conditions of complexity and other
variables, especially days of practice, showed
significant variations. The effects of order
of presentation of the conditions did not
produce significant variation in the experiment
or noteworthy influences on other conditions.

4 The summaries of analyses of variance have been
deposited with the American Documentation Institute.
Order Document 3429 from American Documentation
Institute, 1719 N Street, N.W., Washington 6, D. C.,
remitting $1.00 for microfilm (images 1 inch high on
standard 35 mm. motion picture film) or $1.00 for
photocopies (6 X 8 inches) readable without optical aid.

[Figure 3: time (vertical axis) for manipulation and travel, plotted against conditions A, B, C, and D.]

Fig. 3. Manipulation and travel time as a function
of conditions of complexity of motion. Time is expressed
in seconds. Manipulation time shows some
variation with conditions, but neither of the two aspects
of movement varies significantly.

[Figure 4: time (vertical axis) for manipulation and travel, plotted against days.]

Fig. 4. Manipulation and travel time as a function
of days of practice. Time is expressed in seconds.
Manipulation time varies significantly with practice.

The concrete results of the experiment may
be summarized with reference to some of the
actual differences observed. The main object
of this study was to determine whether increasing
the complexity of a movement pattern, in
terms of its directional complexity, will bring
about an increase in the time of either the
manipulation or travel components of the
movement. Figure 3 illustrates the changes in
manipulation time and in travel time which
were found when the movement complexity
was varied in the four conditions. Travel
time shows almost no change throughout the
four conditions. Manipulation time of the
movement pattern varies, but, as already
noted, this variation is not large enough to be
of any significance. The implication of these
results for principles of motion economy will
be mentioned later in the discussion.

A second major objective of this study was
to determine the effects of practice on motions
of different complexity. Figure 4 illustrates
the changes that occur in the manipulation
and travel components of motion as a function
of practice over a period of four days. As
noted above, the effects of practice produce
significant variation in the experiment, but
this statement applies only to the manipulation
component of the motion pattern. The effects
of practice on the travel component of the
motion are barely significant. As a result of
practice, manipulation time in the switch-turning
motion is reduced by approximately
30%. Travel time is reduced by approximately
10%, and this change occurs almost
entirely during the first day of practice. In
other words, this experiment indicates that
the effects of learning on the type of work
motions studied here have quite distinctive
effects on different components of the motion
pattern.

In addition to the data obtained concerning
the effects of different complexity of movement
and of practice, this experiment has provided
information concerning the relation between
components of movement under the conditions
studied. A correlation value has been computed
between the travel and manipulation
aspects of movement, and has been found to be
+0.29. This correlation is significant at
the 5% point.
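
The significance of a correlation of this size can be checked with the usual t test for a correlation coefficient. This is a sketch under stated assumptions: the article does not say how many pairs entered the correlation, so n = 48 (the number of subjects) is assumed, and the critical values are approximate two-tailed table values for 46 degrees of freedom.

```python
import math


def t_for_correlation(r: float, n: int) -> float:
    """t statistic for testing a Pearson r against zero, with n - 2 d.f."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


N_SUBJECTS = 48      # assumption: one pair of scores per subject
T_CRIT_05 = 2.013    # approximate two-tailed 5% critical value, 46 d.f.
T_CRIT_01 = 2.687    # approximate two-tailed 1% critical value, 46 d.f.

t = t_for_correlation(0.29, N_SUBJECTS)
print(f"t = {t:.2f}")
print("significant at 5%:", t > T_CRIT_05)
print("significant at 1%:", t > T_CRIT_01)
```

Under these assumptions t is a little above 2.05, which clears the 5% level only narrowly and falls short of the 1% level, in keeping with the authors' description of the correlation as being of limited statistical significance.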


Summary 


It has been claimed for many years in the
field of motion study that variations in the
complexity of a motion pattern [...] the
movement, the more efficient [...]. No
experiments of a systematic nature can be
found which, in accordance with meaningful
definitions and observations, demonstrate
that conditions of complexity [...] variations
in efficiency of motion. [...] points up the
necessity of [...] defining of conditions [...]
in order to understand the conditions of
efficient movement. As far as the work
motions studied here are concerned, variations
in the complexity of a movement pattern in
terms of the number of directional dimensions
involved do not produce noteworthy changes
in the efficiency of the manipulation and
travel components of the movement. It
remains to be determined by future research
what other factors of complexity of motion do
influence economy of movement if changes in
such complexity do, in fact, define the level
of economy of reactions.

A previous study (3) has brought out the 
fact that, in the type of controlled motion 
patterns studied here, practice does not affect 
the manipulation and travel components of a 
motion pattern in the same way, and that the 
effects of practice do not interact significantly 
with other conditions of motion. This study 
therefore substantiates previous work in show- 
ing that it is mainly the manipulation compo- 
nent of a motion pattern that is influenced by 
practice, and not the travel component. 

It is believed that the result just mentioned 
may be a clue to the answer of many of the 
problems of specificity in skill learning, and 
of the lack of generality of principles of learning 
in the practical control and understanding of 
psychomotor activity. The fact is often 
recognized that some tasks, for example, 
tracking behavior, rate of tapping, and 
throwing activities, typically display very 
limited practice effects. In these activities, 
travel or free-thrown movements predominate. 
Furthermore, the present data point up a 
fundamental weakness of the traditional 
generalized approach to the study of psycho- 
motor learning, as well as concepts which have 
been derived from such an approach. The 
rules defining the course of learning are quite 
different for the different primary components 
of activity in the psychomotor task. 

In a previous study (1) of a heterogeneous 
group of right- and left-handed subjects, it 
was observed that the travel and manipulation 
characteristics of movement do not correlate 
significantly with one another. A more homo- 
geneous group of subjects of about the same 
size, as used in this experiment, has given a 
correlation of +0.29 between these two aspects 
of the motion pattern. This low correlation 
value of limited statistical significance (5 per 
cent level) suggests that a real problem of
reliable measurement exists in the analysis of
psychomotor activity and that this problem 
may be avoided only by separate measurement 
of the unrelated components of the work task. 


The principles and techniques of motion
study described here provide the basis for
such new approaches to the study of reliability
of performance in skilled movements.

Received August 4, 1951.


References


1. Davis, R., Wehrkamp, R., and Smith, K. U. Dimensional
analysis of motion: I. Effects of laterality
and movement direction. J. appl. Psychol.,
1951, 35, 363-366.

2. Smith, K. U., and Wehrkamp, R. A. Universal
motional analyzer applied to psychomotor performance.
Science, 1951, 113, 242-244.

3. Wehrkamp, R. A., and Smith, K. U. Dimensional
analysis of motion: II. Travel-distance effects.
J. appl. Psychol., 1952, 36, 201-206.


Book Reviews 


Cantor, N. Learning through discussion. 
Buffalo 2, New York: Human Relations for
Industry, 1951. Pp. 108. 


Professional industrial relations personnel 
who depend upon business and industry for 
their livelihood face a problem of great 
magnitude. This problem consists in the
main of translating the results of scientific
studies in the area of psychology and related
fields into language and techniques usable
by the layman.

People in the ‘‘in-group” are always suspi- 
cious of those in the “out-group.” The people 
in the business and industrial world are 
generally suspicious of what they consider 
to be “long hairs” and “impractical dreamers.”
The problem becomes more acute when the
representatives of the “out-group” use language
that only the most learned can understand
and fewer can appreciate.

In addition, most “personnel” men are
really very busy. They don’t have the time
to scan the journals, tie together the various
microscopic bits of research and devise a
technique which will have immediate value in
the business and industrial situation.

Nathaniel Cantor, the sociologist, has
rendered a major service to the [...]
personnel worker in the publishing of his
Dynamics of Learning and, more recently,
Learning Through Discussion, which is the
subject of this review.

Of all the areas in industry and business,
the one where the most work needs to be done
and where the potential results are the most
promising is the area of training. He has
provided a practical tool for work in the
training area. This tool offers some [...] of
being understood by his non-technical colleagues
and by industrial supervisors and
executives.

He states in the preface,

“This book deals with the professional discussion
leader. It is not another book on
‘Conference Leadership’ or on ‘How to Lead a
Discussion’ wherein the reader finds listed the
things to say or to do. [...] list [...] types of
questions to ask will [...]. There is no ‘code’
of the conferee wherein he is supposed to
promise a series of ‘I wills’ or ‘I will nots.’
[...] preparing a discussion will not be found.
There are no check lists.

“The usual manual for leaders contains 
many such guides which are probably of 
some value to the beginning discussion leader 
who feels insecure and uncertain about his 
function. Listing what he should do and 
what he should avoid is a poor substitute for 
lack of knowledge and skill in the art of 
discussion leadership. Indeed, such devices 
block the development of personal skill. A 
concert artist need not be told how to play 
scales... . 

“The present approach deals with the 
fundamental problem, ‘What takes place 
psychologically when a group of people, 
directed by a leader, meets to discuss a 
problem?’ Another way of stating this is to 
ask, ‘How do members of a group learn or 
change, by the exchange of opinions through 
argument, or by discussion on an intellectual, 
verbal basis? Is learning a matter of reasoning 
or does it involve much more?’ 

“It is my firm conviction that a discussion
leader cannot perform in a professional way 
unless he understands what is involved in 
learning. To acquire such understanding and 
the skill to employ it are no simple matters. 
Ideally, there should be instituted a profes- 
sional program for the development of discus- 
sion leaders. Until such time, we will have 
to rely upon the meager literature on discussion 
leadership and the unsatisfactory and hurried 
‘in-service’ programs.” 

This approach is where the major contribu- 
tion of the book lies. It presents techniques 
for the application of sound learning without 
bogging down in technical language and 
formulae. It provides the professional train- 
ing director with a readily usable tool for the 
training of discussion leaders. It is this non- 
professional teacher who makes possible a 
sound educational program which can be 
continued on an economical basis. 

Cantor’s use of Kelley’s basic educational 
fallacies to develop the chapters on Learning is 
excellent. It strikes home and is appealing 
to anyone who has “escaped” from our public 
school system. 

The chapters which follow flow naturally
into the discussion of the Dynamics of Discussion
and the Skills in Discussion. He points
out, “Most of the members of a discussion 
group are not likely to be disciplined social 
scientists. They are not accustomed to con- 
sider issues in terms of the logic of ideas. 
They bring to the discussion their stereotyped 
‘rights’ and ‘wrongs,’ ‘goods’ and ‘bads,’ their 
‘either-or’ views of social behavior. They are 
not in the habit of examining the hidden 
assumptions or biases in their points of view. 
They interpret social events and individual 
behavior in terms of their more or less fixed 
attitudes. They tend to confuse the observa- 
tion of premises with their moral judgments 
about them. 

“One of the responsibilities of the discussion 
leader is to help them learn to look at events 
from many different and conflicting points of 
view. As new facts and new interpretations 
arise, previous attitudes may be modified.” 

It is this type of presentation of psychol- 
ogical truths in simple language that makes 
the book so useful. If more of the body of 
research which has been published in group 
dynamics, leadership, counseling and psychotherapy
could be translated into easily used
technique and presented in simple language 
in the way Cantor has done, the business and 
industrial world would stand up and cheer. 

Howard P. Mold 

Director of Training, 

Minneapolis-Honeywell Regulator Company 


Thorne, F. C. Principles of personality coun- 
seling. Brandon, Vermont: Journal of 
Clinical Psychology, 1950. Pp. 491. 


For the most part, this book represents a 
collection of the series of articles on counseling 
and psychotherapy published by Dr. Thorne 
over a number of years, mostly in the Journal 
of Clinical Psychology. In most cases the 
articles have been modified or combined to 
conform to the organization of the book. 
Here and there transitional and integrating 
discussions have been inserted. Thus, this
work does not represent a new contribution to 
the literature, but does provide a more 
effective vantage point for an over-all view 
of Dr. Thorne’s contributions. 

For the counselor it has certain definite
though limited values. In so far as there is
a trend toward interpreting counseling as a
clinical process designed to serve broad 
preventive and positive mental health func- 
tions, the broad range of treatment problems 
discussed with the emphasis on the borderline 
neurotic and relatively normal individual will 
meet a growing need. On the other hand, 
many of the discussions and cases are oriented 
toward problems where psychological and 
physiological factors are closely intertwined. 
These aspects of the book will have more 
immediate application in psychiatric practice 
or in psychological practice in medical agencies 
than in the usual practice in psychological 
counseling. Particularly illustrative of this 
limitation is Chapter 3 in which discussions of 
testing and diagnosis are limited to evaluation 
of psychopathology. All counselors, including 
those who confine themselves to helping 
people make decisions, will obtain food for 
thought from the many emphases on cognitive 
processes in helping people with personal 
problems, e.g., imparting information, perceptual
training, and dealing with memory. The
last two sections are devoted to maximizing
intellectual resources and to methods of
intellectual reorientation, respectively. Too
often, an interest in “depth” factors in 
personality leads to unwise neglect of cognitive 
intellectual elements in personality adjustment. 

The author is long on discussions of practical 
problems, almost fanatical in expressing his 
devotion to science and eclecticism, but short 
in the theoretical coherence of his various 
concepts. This reviewer had the impression 
that it would be easy for readers to acquire a
series of undigested, somewhat contradictory
ideas about counseling and other therapeutic
phenomena as a consequence of the book’s
rampant eclecticism. On the other hand,
discussions such as those of practical problems
of professional responsibility (chapter 5) and
of creating an effective counseling relationship
(chapter 8) will prove useful to many beginning
counselors.

In summary, many parts of this book can
provide useful supplementary reading for
didactic courses in counseling. 


Edward S. Bordin 


University of Michigan 


Hathaway, S. R., and Meehl, P. E. An atlas
for the clinical use of the MMPI. Minneapolis:
Univ. of Minnesota Press, 1951. Pp.
xli + 799. $9.75.

Despite the title, the Atlas does not deal
with clinical uses of the MMPI. It consists
of 798 brief case histories abstracted by the 
senior author from the clinical records of 
in-patients at the Psychiatric Unit of the 
Univ. of Minnesota Hospitals. These are 
supplemented by 170 cases, obtained from 
eleven other sources, including prisoners, 
college students, Veterans Administration 
hospital patients, guidance clients, and patients 
in an English hospital. The histories are 
factual rather than interpretive in nature. 

Each of the 968 cases is headed by one or
more MMPI profiles and related diagnostic
and descriptive data. The profiles have been
reduced to a code which summarizes the form
or shape and gives some information about
the intensity or elevation. The cases are
arranged according to this code and are
extensively indexed and cross-indexed. This
enables the user to look up cases on the basis
of the MMPI pattern and provides material
on all the usual profile configurations and
many atypical ones.

The reader will have to have had
considerable experience with the MMPI before
he can profitably use the Atlas. The discussion
prerequisite to an informed and sophisticated
use of the instrument is not provided
in the book. For those unfamiliar with the
literature a 259-item bibliography
(November, 1950) is given and six articles
are starred as recommended reading.

There are eight figures showing in graphic
form the relative frequencies of [...]
codes for both male and female psychiatric
patients, normal adults, college students, and
ninth-graders. Eight tables show for
these groups the frequencies of all combinations
of highest and lowest scales [...]. Of
particular interest are other [...]
two-digit codes for fifteen common diagnostic
groups contrasting their relative frequencies
with those obtained in a normal population and
in a general clinical sample.

There are a great many relationships
inherent in the tabular and textual material
which the authors have not made explicit.
It may be expected that other workers will 
carry out researches on the contents of this 
volume much as governmental and census 
statistical reports are often utilized. This 
would require evidence for the accuracy and 
representativeness of the samples and case 
material presented; such evidence is, unfor- 
tunately, at present not available for evaluat- 
ing the book from this standpoint. 

The coding employed in the Atlas has two
major disadvantages: first, scales with T-scores 
from 46 to 55 are not coded; second, “low” 
scales with scores below 46 are coded from 
lowest to highest—just the opposite of the 
“high” scales. This reversal leads to awkward-
ness if there is more than one low scale.
The first deficiency causes variability in code 
length if any scales lie in the uncoded range. 
Also it may be impossible to tell which is the 
lowest (or in some cases the highest) scale if 
more than one scale is uncoded. This diffi- 
culty is exemplified by Table I, page xxvii 
where 50.4% of 710 male psychiatric patients 
have their low points uncoded; there is no 
way of determining whether these low points 
are distributed in the same proportions as 
those with coded low points. The utility of 
all the tables and figures is reduced because of 
this shortcoming. 
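
The two disadvantages described above can be made concrete with a minimal sketch in Python. It assumes only what the review states: T-scores above 55 are coded as high scales from highest to lowest, T-scores below 46 are coded as low scales from lowest to highest, and scores from 46 to 55 are not coded. The function names, the lettered scale labels, and the example profiles are hypothetical, and the sketch ignores ties and whatever punctuation the Atlas code uses to show elevation.

```python
from typing import Dict, List, Tuple

HIGH_CUTOFF = 55  # T-scores above this are coded as "high" (from the review)
LOW_CUTOFF = 46   # T-scores below this are coded as "low" (from the review)


def code_profile(t_scores: Dict[str, int]) -> Tuple[List[str], List[str], List[str]]:
    """Return (high, low, uncoded) scale labels for one profile.

    High scales run from highest to lowest T-score; low scales run from
    lowest to highest, which is the reversal the review objects to.
    """
    high = sorted((s for s, t in t_scores.items() if t > HIGH_CUTOFF),
                  key=lambda s: -t_scores[s])
    low = sorted((s for s, t in t_scores.items() if t < LOW_CUTOFF),
                 key=lambda s: t_scores[s])
    uncoded = [s for s, t in t_scores.items()
               if LOW_CUTOFF <= t <= HIGH_CUTOFF]
    return high, low, uncoded


def low_point_recoverable(t_scores: Dict[str, int]) -> bool:
    """True if the lowest scale can be read off the code alone.

    With a coded low scale the low point is known. Otherwise it is known
    only when at most one scale falls in the uncoded 46-55 range.
    """
    _, low, uncoded = code_profile(t_scores)
    return bool(low) or len(uncoded) <= 1


# Hypothetical profiles with lettered scales (not real MMPI data).
profile_a = {"A": 72, "B": 60, "C": 50, "D": 44, "E": 40}
profile_b = {"A": 72, "B": 60, "C": 50, "D": 53, "E": 47}

print(code_profile(profile_a))
print(low_point_recoverable(profile_a), low_point_recoverable(profile_b))
```

In the second hypothetical profile three scales fall in the uncoded range, so the code cannot say which of them is the low point. This is the situation the review reports for 50.4% of the male psychiatric patients in Table I.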

The Atlas should encourage clinical workers 
to utilize the profile patterning and configural 
approach to the MMPI and should discourage 
the unprofitable adherence to the diagnostic 
terminology of the individual scales. It will 
then be possible to determine empirically and 
without psychiatric bias the personality cor- 
relates—both normal and abnormal—of the 
various profile patterns. 

George S. Welsh 

Veterans Administration Hospital,

Oakland, California 


Read, Herbert. Education through art. N.Y.:
Pantheon Books, 1949. Pp. xxiii + 320. 
$5.50. 


This book will interest psychologists prin- 
cipally because it sets forth an eclectic theory 
of personality development through creative 
expression. Read develops an idea as old as 
Plato—that art should be the basis of any
sound education—and offers it to the modern
world as a solution for society’s ills. 

Read’s material on sensation, perception and 
imagery represents classical psychology. It 
bears the distinct impress of British associa- 
tionism, shows the color of Gestalt theory, 
but has little mark of American behaviorism 
or neo-behavioristic viewpoints. His person- 
ality theory is chiefly based on types which 
bear the hallmarks of Jung and Jaensch; 
he tends strongly to speak in terms of 
dichotomies. 

Read’s position on art in relation to thinking, 
discipline, and moral training illustrates several 
diffuse but pronounced trends in modern 
education. It is antirationalistic, emphasizing 
an intuitive approach to truth. Certain 
aspects of experience exist in the nature of 
things (e.g., balance, symmetry, proportion, 
rhythm), and are apprehended because they 
“feel right” to the observer. The tendency 
to empathy and abstraction in art is deeply 
rooted in human nature. The artist, not the 
logician, is close to the personality and mind of 
the child. Education consists in nurturing 
the relatively pure tendencies of the child 
mind and avoiding the harm which comes 
through ordinary educational procedures. 

Read’s theory also holds that productive 
thinking in the scientific sense is closely related 
to artistic production and may be governed by 
the same principles. The elements of intui- 
tion, discovery, “insight” are given an import- 
ant place. Today, when many persons are 
becoming concerned with the nature of 
scientific creativity, this antirationalistic def- 
inition of scientific thinking is more than a 
curiosity. One notes many kindred viewpoints 
in modern thought. Whether the psychol- 
ogical reader agrees with Read’s position or 
not, he might well try to discover why such a 
position recurs and why educators appear 
receptive to what in earlier years was con- 
sidered so definitely unscientific. 

Read derives a series of categories for the 
evaluation of child art and advances a theory 
of art production in relation to human personal- 
ity which will interest those concerned with 
projective techniques. He also offers a blue-
print for adapting his educational and person-
ality theories to the education of children
and adolescents.

Dale B. Harris 

Institute of Child Welfare,
University of Minnesota 


Link, Henry C. The way to security. New
York: Doubleday and Company, 1951.
Pp. 224. $2.50.

In an article written in 1939, I quoted H. A. 
Murray to the effect that the kind of answers 
that satisfy people’s curiosity and meet their 
psychological needs was being provided not 
by psychologists but by medical men. To 
make the observation adequate today, one 
must add to medical men, preachers, priests, 
rabbis, and chautauqua lecturers. Henry C. 
Link, however, is one psychologist who has 
consistently written books that satisfy people’s 
needs. Book reviews in newspapers and
magazines and sales records testify to the
popularity of The Return to Religion, The
Rediscovery of Man, and The Rediscovery of
Morals. They have not been equally satisfy-
ing to his fellow psychologists for at least two
reasons: (1) they are anecdotal in style; and
(2) they describe and explain in terms of the 
Spirit and things spiritual, entities which are 
not encountered in a life time of work in the 
laboratory. 

Link’s most recent book, The Way to
Security, is in similar vein and promises to
satisfy the one group and dissatisfy the other
as before. Its title reflects the keen sense of
the marketing expert, but its content derives
from a long experience as psychologist, philos-
opher, father of a family, personal counselor
and sympathetic observer of people. As in
his earlier books, he is in revolt against
present day materialism.

Link notes that with all the current pro-
visions for security by society and by the
individual, the sense of insecurity was never so
great. The reason is that security has come
to mean social security and material security
whereas the basic and satisfying security is
personal and spiritual. The latter cannot be
derived from the former. Why has the search
for security shifted from the personal and
spiritual to the material and social? Why has
security come to be something to be received
rather than something to be achieved? Link
finds the answers to these questions in twenty-
seven pseudo-scientific fallacies, whose dis-
cussion and illustration constitute the content
of his book. He finds these fallacies in the 
home, in the church, in the schools, and in the 
government. These fallacies lead to the 
building up of an insecure personality devoid 
of self-reliance. 

Link goes to the dictionary for a definition 
of security and finds that “to be secure” means 
“to be fastened to.” Taking off from that 
meaning he shows that young people today
have nothing to fasten on to as earlier genera-
tions did. The old fixed principles of right
and wrong, the ten commandments and the
Judaeo-Christian ethics have been replaced by
the concept of “let your conscience be your
guide.” To bewildered parents who ask him
to recommend a book on how to bring up
children he usually replies: “If you will do
with your children pretty much what your
parents and grandparents did with theirs, it
will be better than any modern book I can
recommend. It may not be perfect but at
least it will not be bad” (page 50).

Concerning religious faith he says: “The
basis for authority and security is a faith in
. . . moral principles and standards. That
faith has been severely shaken, bringing about
a . . . decline in discipline. Now an
. . . mass of evidence proves that
the faith of our ancestors was not entirely
misplaced. Neglected principles that were
. . . in the making are being rediscovered
and reinterpreted in terms of modern life”
(page 56).
Link’s most severe criticism against formal
education is that though it educates the
intellect it does not develop personality. If
students wish “to develop mature character
they must rely on their families and the
church. Inadequate though these may be,
they are still far superior to the forces of
public education. Indeed, since the training
. . . must be largely based on religious
principles, the public schools cannot do this
training properly even if they would. Religious
education is, for certain reasons, no longer
compatible with secular education” (page 135).
. . . education seems to Link to offer the
best medium for spiritual education within the
modern educational system (page 138).




The Welfare State comes in for severe 
indictment because it is directed toward 
material and social security with the impov- 
erishment of personal and spiritual security. 
“. . . a government cannot assume responsi-
bility for people’s welfare without profoundly
affecting their moral fibre. To the extent that 
government takes care of him, to that extent 
the adult citizen is deprived of the moral 
responsibility for himself. And since a govern- 
ment without religion tends to be neutral in 
its moral standards, it cannot make such 
standards a condition of its welfare payments. 
The net result is the progressive breakdown in 
the moral standards of all who participate in 
the welfare state” (page 224). 

Space does not permit comments on the 
many specific topics that Link discusses and 
illustrates from his extensive counseling ex- 
perience, such as: the need for love, authority 
and discipline; sex, love, marriage and children;
the psychology of fear and worry; sleep and 
relaxation; the mature character; war and 
security; the threat of communism from 
within; and the dollar as a gauge of spiritual 
security. 

The last few sentences of the book give at
once the causes underlying a sense of insecurity
and the remedy for it. They are: “By one
false theory after another we have emasculated
the concept of God to its present . . .
stature. Still further, we have created God in
our own image and can now contemplate
him without a sense of fear or sin. Such
complacency is not conducive either to
spiritual or material security.

“Nevertheless the good news that God . . .
to save man because man cannot save himself
brings increasing conviction. And with this
conviction also comes the certainty that man
can be saved if he will make the effort to meet
God halfway” (page 224).

A. T. Poffenberger

Montrose, N. Y.


Institute for Human Adjustment. Training
of psychological counselors. Report of a
conference held at Ann Arbor, Michigan, July
. . . and January 6 and 7, 1950.
Ann Arbor: University of Michigan Press.

This significant document is the report of
a conference held at the University of Michigan
to discuss “the training of counselors whose 
background is to be primarily psychological.” 
The conference was planned by the Executive 
Committee of the APA Division of Counseling 
and Guidance and the University of Michigan 
Bureau of Psychological Service, with the 
help of the National Institute for Mental 
Health, USPHS. Twelve persons met in 
the two conference periods, representing the 
National Institute for Mental Health, the U. S. 
Office of Education Division of Occupational
Information and Guidance, Clinical Psychol- 
ogy, an industrial counseling agency, and the 
APA Division of Counseling and Guidance. 
Care was taken in sampling from the last 
group to represent proponents of “informa- 
tional and general student personnel proce- 
dures” as well as those favoring “objectives of 
emotional and personality development” (p. 4). 

The conference procedure and report prep- 
aration (see pp. 4-5) seem to have permitted a 
useful synthesis of participant contributions. 
Perhaps the most valuable outcome is the 
description of psychological counseling and of 
levels and content of training. Psychological 
counseling is viewed as a generic method of 
helping an individual who has problems arising 
from his interpersonal relations to (1) increase 
the accuracy of his “self-percept,” (2) increase 
the accuracy of his “environmental percep- 
tions,” (3) integrate “self-percepts” and “en- 
vironmental perceptions,” (4) obtain relevant 
information, and (5) improve in his capacity 
to plan and execute behavior effectively out- 
side of the counseling situation. As might be 
expected in a preliminary statement, much of 
this material needs to be revised. By what 
criteria does one test whether one is “increasing 
the accuracy of the individual’s self-percept” 
(p. 7) or “aiding the individual to make a 
better—in the sense of happier—adjustment 
to his life situation” (p. 10)? 

Recommended levels of training include: 
(1) the “part-time counselor” with one year of 
graduate work, two-thirds of which would be 
in psychological training, (2) the “psychological 
counselor” with two years of graduate work, 
primarily psychological, approximately one- 
third of which would be in counseling theory 
and practice, and (3) the “counselor-psychol-
ogist” (why not “counseling psychologist”?)
who would be trained at the doctoral level.


The conference report is particularly helpful 
in describing training and standards for Levels 
(1) and (2). The doctoral program is only
briefly discussed. 

Of course, some important issues have not 
been resolved. For example, one infers that 
there was sharp dissension over the require- 
ment of teaching experience prior to the 
practice of psychological counseling in an 
educational setting (p. 16). There does seem 
to have been agreement, however, that for the 
counselor in training there should be “some
members of the teaching staff with adequate 
background in both theoretical training and 
practical experience” (p. 24), and “that no one 
should attempt to counsel independently until 
he has had a period of closely supervised 
experience” (p. 18). 

Exception may be taken to “the prevailing 
opinion . . . that formal learning theory was 
not at the stage where it had any direct 
contribution to the training of this type of 
applied worker” (p. 17). This flat pronounce- 
ment is unfortunate in the face of the many 
relevant and seemingly helpful attempts to 
integrate learning hypotheses with counseling 
procedure which have been provided in recent 
years by such writers as Dollard and Miller, 
Mowrer, Magaret, and Shoben. 

In conclusion, the participants—especially 
E. S. Bordin, conference chairman and report 
editor—are to be congratulated upon their 
preliminary job of definition and prescription.
The report should help to bring about a
needed clarification of counselor activities and
training within the Division of Counseling and
Guidance, APA, and among related profes-
sional groups who may perform “psycho-
logical counseling” functions.

Harold B. Pepinsky 

The Ohio State University 


Boder, David P. I did not interview the dead.
Urbana: University of Illinois Press, 19 
Pp. xx + 220. $3.50. 


Initially, Dr. Boder’s project seems fascinat-
ing. In 1946 he travelled to several of the
European displaced persons camps, electrically
recording his interviews with the inmates
concerning their recent experiences. As a
group, these DP’s have gone through a set of
conditions more extreme, more potentially
destructive of adjustment patterns than any 
to which a sizeable group of humans in recent 
history has been exposed. The plan of
obtaining research material from them in this 
form is potentially profitable. But to be 
worth while, the venture must satisfy a few 
basic requirements in aims, sampling and
controls. Similarly, this volume as a report of
his work must meet other requirements to
make any scientific contribution.

The report does not state any definitive
purpose other than a general one of getting
research material. If this reflects the actual
lack of guiding hypotheses in the collection
of the data, then it is difficult to see how the
material can serve adequately for either a
nomothetic or an idiosyncratic analysis. To
use the information to characterize some group
the analyst needs some description of the
sampling model used. He reports only, “In
all, one hundred and twenty hours of interview-
ing were recorded, covering stories of about
seventy people, representing nearly all creeds
and nationalities in the DP installations in
the American Zone. Seven complete narra-
tives and parts of one, the longest interview,
are reported in this book” (p. xiii). Of these,
six are Jewish, one Mennonite and one
Greek Orthodox. What generalizations can
be formed on these selections?
If the author is offering this material only
for its value in generating hypotheses, he
seems to have defeated his purpose in at
least two ways. First of all, in the meager
amount of material printed in this volume,
since the whole seventy cases themselves
would have been very thin fare in contemporary
sociological research. Secondly, he reports
rejecting some cases because their
experiences seemed too unusual, desiring
instead the rank and file DP’s. In an
exploratory survey, however, it is the extremes
which are desired to set some limits to the
kinds of phenomena in the domain.
Examination of the available material casts
serious doubt on the worth of the unpublished
material if he did not control the
coverage any better on the others. Most of
the common principles of obtaining testimony
have been violated. In rejecting what he
calls “the usual ‘pencil and paper’ method of
interview,” he has thrown out most of the
measures designed to assure adequate coverage
of material in all cases. As a result, when the
reader attempts to apply Dr. Boder’s “Trau- 
matic Index,” which is given in an abridged 
form only, he finds it impossible because the 
cases are not sufficiently comparable. 

Nothing is known of the cases but what is 
obtained in the interview. Consequently no 
checks on the range of possible errors of this 
method of collection of data are possible. 
Conceivably, there does remain the possibility 
of intra-interview analysis, treating the record- 
ing as a personal document. The author 
suggests this possibility when he mentions 
evidence of traumatic disruption of their 
language habits, or the presence of inconsist- 
encies within the records. Here too the 
analyst is frustrated by the clumsy and hap- 
hazard interviewing methods used. Time 
and again, the subjects start relating in their 
own way some personal experience only to be 
interrupted by seemingly trivial or irrelevant 
queries, asides or equivocations. These are 
so frequent as to distort seriously the whole 
set of documents. 

These and similar errors are sufficiently 
damaging to detract from the worth of this 
book for psychology, sociology or anthropology. 
Perhaps this was not the actual aim of the 
author’s efforts, however. A key to the 
underlying motivations seems to lie in the 
introductory paragraph on page xi where he 
mentions the “significance of Preserving for 
posterity the impressions and emotions aroused 
by the sight of thousands of victims . . .”
(italics reviewer’s). What is most valued is
not the behavior of the subjects interviewed 
but the behavior of the audience in its reaction 
of outrage. As such it is not scientific study 
but propagandistic efforts at arousing emotion. 
If this is so, it should be clearly labelled as 
such, and evaluated as such. Even as 
propaganda, although it is powerful, it does 
not come up to John Hersey’s The Wall.


W. Grant Dahlstrom 
State University of Iowa 


New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be sent to Donald G. Paterson, Editor,
Department of Psychology, University of Minnesota, Minneapolis 14, Minnesota.


The psychology of teaching reading. Irving H. Anderson
and Walter F. Dearborn. New York: The Ronald
Press Co., 1952. Pp. 380.

The public librarian. Alice I. Bryan. New York:
Columbia University Press, 1952. Pp. 474. $6.00.

Applied psychology. Abridged edition. Harold E.
Burtt. New York: Prentice-Hall, Inc., 1952. Pp.
480.

Practical applications of democratic administration.
Clyde M. Campbell, editor. New York: Harper and
Brothers, 1952. Pp. 325. $3.00.

Factor analysis. Raymond B. Cattell. New York:
Harper and Brothers, 1952. Pp. 462. $6.00.

Counseling in catholic life and education. Charles A.
Curran. New York: Macmillan Co., 1952. Pp. 462.

Proceedings of the international workshop on guidance.
Mitchell Dreese et al. Germany: Office of Public
Affairs, Education and Cultural Relations Division,
Education Branch, Office of the United States High
Commissioner for Germany, 1951. Pp. 171.

A basic text for guidance workers. Clifford E. Erickson,
editor. New York: Prentice-Hall, Inc., 1952. $4.75.

The counseling interview. Clifford E. Erickson. New
York: Prentice-Hall, Inc., 1952. $2.50.

Social psychology. Robert E. L. Faris. New York:
The Ronald Press Co., 1952. Pp. 420.

Understanding heredity. Richard B. Goldschmidt.
New York: John Wiley and Sons, Inc., 1952. Pp.
228. $3.75.

Psychology in industry. J. Stanley Gray. New York:
McGraw-Hill Book Co., Inc., 1952. Pp. 401. $5.00.

Measurements of human behavior. Revised edition.
Edward B. Greene. New York: The Odyssey Press,
Inc., 1952. $4.75.

Growing through play. Ruth E. Hartley. New York:
Columbia University Press, 1952. Pp. 62. $.75.

New play experiences for children. Ruth E. Hartley,
Lawrence K. Frank, and Robert M. Goldenson.
New York: Columbia University Press, 1952. Pp.
66. $.75.

Readings in industrial and business psychology. Harry
W. Karn and B. von Haller Gilmer. New York:
McGraw-Hill Book Co., Inc., 1952. Pp. 478. $3.50,
paper; $4.50, cloth.

Increasing personal efficiency. Donald A. Laird.
Fourth edition. New York: Harper and Brothers,
1952. Pp. 294. $3.50.

Incentive management. James F. Lincoln. Cleveland:
Lincoln Electric Co., 1951. Pp. 280. $1.00.

The yearbook of psychoanalysis. Sandor Lorand, man-
aging editor. New York: International Universities
Press, Inc., 1952. Pp. 271. $7.50.

Marketing research. David Luck and Hugh Wales.
New York: Prentice-Hall, Inc., 1952. Pp. 480.

Art and technics. Lewis Mumford. New York: Co-
lumbia University Press, 1952. Pp. 162. $2.50.

Lectures and conferences on mathematical statistics and
probability. Jerzy Neyman. Second edition. Wash-
ington, D. C.: Graduate School, United States De-
partment of Agriculture, 1952. Pp. 274.

Play, dreams and imitation in childhood. Jean
New York: W. W. Norton and Co., Inc.
Pp. 296. $5.00.

Lives in progress, a study of the natural growth of per-
sonality. Robert W. White. New York: William
Sloane Associates, Inc., 1952. Pp. 320. $2.75.

The hand in psychological diagnosis. Charlotte Wolff.
New York: Philosophical Library, 1952. Pp.

Personality and problems of adjustment. Kimball Young.
Second edition. New York: Appleton-Century-
Crofts, Inc., 1952. Pp. 294. $5.00.


Journal of Applied Psychology


Vol. 36, No. 5


OCTOBER, 1952 


An Opinion Survey of a Regional Union Group 


Keith Davis¹


Indiana University 


and 


Edward E. St. Germain 


San Francisco Port of Embarkation 


Although employee opinion surveys are
widely accepted and used in the United States
today,² survey variations have been generally
omitted by those making them. One variation
seldom used is to have the union organization
sponsor the survey and distribute the survey
questionnaire. Another variation is to have
the union administration use the survey to
determine its members’ attitudes toward its
effectiveness in doing its own job, as well as
determine their attitudes toward their em-
ployer. This article reports the results of an
opinion survey of 140 members of a midwestern
union, early in 1951, using the variations
described.
If one excludes opinion surveys of the general
public reporting about employers and
unions “in general,” one finds that the usual
survey practice is for the employer to admin-
ister the questionnaire to his employees or to
have a consultant administer it. Unions have
at times “cooperated with” an employee
opinion poll or occasionally co-sponsored one,
but generally they have not independently
sponsored opinion surveys of their members.
At the time this survey was made, the research
departments of both C I O and A F of L
national headquarters informed the authors
that they had no knowledge of any union
having conducted a formal poll of its members’
attitudes toward company management prac-
tices. Use of the union as survey sponsor
probably gives the employee added confidence
that his comments about management will not
be used against him by management. This
should encourage him to give more accurate
responses. On the other hand, it is conceivable
that he might attach to the survey a union-
centered antagonism toward management.
This would cause a union-sponsored survey to
be biased against management.

¹ Dr. Davis is Associate Professor of Management
at Indiana University, and Mr. St. Germain is Organi-
zation and Methods Examiner, San Francisco Port of
Embarkation. This research was completed when the
authors were Associate Professor of Industrial Relations
and Research Associate, respectively, at the University . . .

² For example, a recent report states that 35 per cent
of employers have used employee opinion surveys and
that over 51 per cent of those not using them consider
them of some value. See Shurtleff, W. Is manage-
ment listening? Personnel, 1951, 27, 101-102.

If a union sponsors an attitude survey about 
management, it seems appropriate to use the 
same survey to determine how its members
feel about its own administration. Attitude


The union, 
management 


sometimes in a different way. 
the same human raw 
management does. 
obligations that other mana: 


Description of the Group 


The area of the Survey was an A F of L 
union local situated in a midwestern state, 
The local headquarters was in an industrial city
of over 100,000 population which had a climate
considered “favorable” to unionism. The 1,200 
members operated many varieties of heavy 
machinery used in commercial construction 
and road building, such as draglines, bulldozers, 
concrete mixers, pile drivers, cranes, and 
tractors. The membership at the time of the 
survey was estimated to be 60 per cent skilled, 
30 per cent semi-skilled, and 10 per cent un- 
skilled, but it varied somewhat according to 
seasons and work demands. Hourly wage 
rates ranged from about $1.60 to $2.35. 

Since the union members work on construc- 
tion jobs, most of them work in small groups 
of two to twenty-five in different towns and 
rural locations throughout the state. Each
member regularly works with various em- 
ployers over a period of time, remaining with 
one employer only as long as the type of 
equipment he operates is needed. Because 
of the wide geographical dispersion and move- 
ment of membership, the union maintains 
between five and ten area and branch offices 
in other cities. 

The key union administrative official is the 
full-time business agent, whose office is in the 
local headquarters. Employers list orders for 
operators with him, and he assigns operators— 
and removes them—according to his judgment. 
He is an elected officer and has been in office 
over ten years. 

The local members generally have more 
contact with the union than they do with any 
individual employer, because they work for 
different employers, but are always assigned 
and removed by the same union. Even on the 
job, the members are largely free of regular 
contact with their employer and free of close 
supervision by him. The union does not have 
a union shop in the industry, but the non-union 
group of similar workers is very small. 


Survey Method 


The business agent agreed to sponsor the 
survey, after the authors suggested it. The 
agent helped the authors prepare a three-page 
questionnaire of 28 questions. Each question 
was short, simple, and pitched to the reada- 
bility of the respondents, who answered by 
checking one of five responses on an attitude 
scale. One of the available responses was
“undecided.” The questionnaire heading was
“Wanted: Your Opinion of Your Union and 
Your Job.” The written directions were a 
part of the survey; they assured the employee 
that the survey was anonymous and that the 
results would be available to him through the 
union. Prior to answering the survey, each 
employee answered general questions about 
such subjects as his age group and marital 
status. 

The questionnaire was pretested on a few 
union workers not in this union. Revised 
questionnaires (301 in number) were then sent 
to the business agent for distribution. A few 
days previously he had notified the member- 
ship that the survey was pending. Since the 
potential respondents were working in various 
locations over the state, the agent distributed
part of the questionnaires personally during
his regular tour to different jobs. The remain-
der of the questionnaires were distributed
through members of the Executive Board who
met at headquarters during the survey period.
Reliability of the returns was checked by the
cumulative frequency method,³ which showed
no important difference in replies resulting
from the two methods of distribution.

Replies received numbered 151 out of the
301 sent. Of those received, 11 were excluded
because of incompleteness, leaving 140 replies
for analysis. These 140 replies represented
47 per cent of questionnaires distributed and
about 12 per cent of all dues-paying members.
Replies were received from about 45 per cent
of the active job locations in the state. Re-
spondents readily admitted they had opinions,
as shown by the fact that the “undecided” choice
received no more than five per cent of responses
to any question.
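
The return figures above follow from simple arithmetic, which can be checked with a short sketch. The counts (301 distributed, 151 received, 11 excluded) come from this section and the membership of 1,200 from the description of the group; the variable and function names are hypothetical, and the rounding to whole per cents is an assumption about how the reported figures were obtained.

```python
# Counts reported in the article.
distributed = 301          # questionnaires sent to the business agent
received = 151             # replies received
excluded_incomplete = 11   # replies dropped for incompleteness
membership = 1200          # members of the local


def per_cent(part: int, whole: int) -> int:
    """Percentage rounded to the nearest whole per cent (assumed rounding)."""
    return round(100 * part / whole)


usable = received - excluded_incomplete        # replies kept for analysis
usable_rate = per_cent(usable, distributed)    # share of questionnaires distributed
membership_share = per_cent(usable, membership)  # share of all members

print(usable, usable_rate, membership_share)
```

Under these assumptions the 140 usable replies are 47 per cent of the questionnaires distributed and about 12 per cent of the membership, matching the figures reported.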


Results 


Attitudes Toward Union Administrative Prac-
tices and Problems. Union members generally
approved of their local’s administrative prac-
tices because their “favorable” responses at
no time dropped below 70 per cent of
responses (see Table 1). However, in most
instances a greater proportion was “moderately
favorable” rather than “very favorable.” The
weakest part of union administration, from the
members’ viewpoints, was the business agent’s
fairness in assigning jobs. Only 70 per cent
felt that he did not show favoritism. The
strongest part of union administration was the
conduct of meetings. Ninety per cent felt that
meetings were worth the time spent, and
83 per cent desired to attend union meetings
. . . or oftener. It should be noted that
less than 83 per cent actually attended
because most meetings were held at
union headquarters and members had to travel
a considerable distance to them.

³ L. O. Brown. Marketing and distribution research.
The Ronald Press Company, 1949, pp. 524-528.


Table 1

Opinion of Selected Union Administrative Practices and Problems, Arranged in
Order of Favorable Responses Based on 140 Replies

                                                 Per Cent of Responses
                                    Total       Total       Most        Second Most Second Most Most
                                    Favorable   Unfavorable Favorable   Favorable   Unfavorable Unfavorable
Subject of Opinion                  Responses   Responses   Response    Response    Response    Response    Undecided

Worthwhileness of union meetings    90          9           48          42          6           3           1
Likelihood that present contract
  will be renewed without major
  changes                           . .         . .         . .         . .         . .         . .         . .
Administration of grievances        . .         . .         . .         . .         . .         . .         . .
Desire to attend union meetings     83          . .         . .         . .         . .         . .         . .
Probability that members of the
  local personally support same
  political candidates              80          . .         . .         . .         . .         . .         . .
Local’s strength in bargaining      77          . .         24          53          . .         7           1
How well respondent personally
  understands labor contract        77          23          . .         58          23          . .         0
Extent to which local keeps its
  members informed on union
  matters                           76          21          22          54          14          7           3
Extent to which union members
  in general are in the “right”
  when they call a strike           . .         . .         . .         . .         . .         . .         . .
Quality of present labor contract   . .         . .         . .         . .         . .         . .         . .
Fairness in assigning jobs          70          26          24          46          16          10          4


Twenty per cent of the members felt that
their bargaining contract was “excellent,”
but an additional 53 per cent rated it “good.” The
remaining 27 per cent had a low opinion of the
contract, which indicates that more than one
in four members will want contract improve-
ments at the next contract period. In
spite of this, . . . per cent believed that the contract
would be renewed again without major

changes. It is interesting that the proportion 
who thought the labor contract was excellent 
(20 per cent) was about the same as the pro- 
portion who thought they understood all of 
the contract.* This suggests a relation be-
tween degree of understanding of the contract
and satisfaction with it. Further, 23 per cent 
had “little” understanding of the contract and 
this indicates that much more contract com- 
munication needs to take place. Twenty-one 
per cent felt that they were not well informed
on union matters in general. 

A significant factor in the employees’ over-all
viewpoints may be that 77 per cent believed
their union was “usually” or “always” stronger
than the employers at the bargaining table. In

* This suggests the desirability of simplified language
in union contracts. See Lauer, Jeanne, and Paterson,
D. G. Readability of union contracts. Personnel,
1951, 28, 36-40; and Tiffin, J., and Walsh, F. X. Read-
ability of union-management agreements. Personnel
Psychol., 1951, 4, 327-338.




other words, they could “afford” to want more
at the bargaining table, because they had the 
strength to get it. There is some logical basis 
for their opinions, since the employer group 
consisted of many independent employers, but 
the union group operated as one organized 
unit. Although the union and management 
have worked together without strike for over 
a decade, the union’s confidence in its power 
would undoubtedly affect its conduct during 
a severe strike. From another viewpoint, 
employer conduct at each bargaining session 
may be influenced by the fact that union mem- 
bers have so much confidence in their own 
strength. 

Eighty-four per cent of the respondents 
agreed that the union’s handling of grievances 
was satisfactory, but only 32 per cent would go 
so far as to say that grievances were “always” 
handled satisfactorily. Inasmuch as manage- 
ment and the union have amicably settled 
grievances without resort to strike during the 
last ten years, one might wonder why more 
members were not strongly satisfied with 
grievance administration. One explanation is 
that grievance settlements usually involve com- 
promise, which at best leaves the employee 
moderately satisfied, but not always strongly 
satisfied. 

Most members (eighty per cent) believed 
that they usually supported the same political 
candidates, which indicates that their leader- 
ship has successfully created within their union 
environment a feeling of political solidarity. 
As a corollary question, members were asked 
if they thought labor unions have a strong 
influence in politics, and 84 per cent responded 
affirmatively. Furthermore, 76 per cent felt 
that unions belonged in politics.5 It seems
clear that the respondents favored union politi- 
cal action, felt that their associates were coop- 
erating to support the same candidates, and 
felt that they strongly influenced election 

results. 

A general question was asked: “How often 

are labor unions in the right when they call a 
strike?” Forty-six per cent checked “usually” 


5 In a national survey in November, 1947, only 59
per cent of the respondents believed unions should 
participate in politics. See What the factory worker 
really thinks. Factory Mgmt. & Maint., November, 
1947, p. 93. 




and thirty per cent checked “always” which 
indicates a high degree of belief in their cause 
and in the strike as a device of coercion. 

Attitudes Toward Employer Administrative
Practices and Working Conditions. Attitudes 
toward the employer group were not as con- 
sistent as those toward the union, but they 
were over-all about as favorable. Favorable 
responses did not drop below 66 per cent for any 
subject (see Table 2). The highest proportion 
of favorable responses during the entire survey 
resulted from the question, “How often do the 
contractors interfere with your job perform- 
ance?” Ninety-three per cent of the respond- 
ents believed that employers usually did not 
interfere with job performance. This proportion
speaks well of employer supervision and
employee ability to do the job without close
supervision.

Respondents were moderately satisfied with
their working conditions. Seventeen per cent
of the 140 respondents felt that their job was
never monotonous, and another 71 per cent
felt that their job was only occasionally
monotonous. The respondents’ jobs are generally
viewed as non-monotonous by their employers,
who point out that they work independently
or in small groups and are to a considerable
extent “their own boss” on the job. Another
factor relieving monotony is the constant
change of employers. Seventy-one per cent
approved their present practice by saying that
they did not desire to work for the same
employer regularly.

Seventy-five per cent believed that most
employers furnished good tools and equipment,
but 24 per cent thought few employers did.
Although equipment in this industry is expensive
and recently has been difficult to purchase,
employers should through good communication
and equipment maintenance be able to reduce
this strong minority that is dissatisfied with
equipment. Respondents rated their over-all
working conditions almost identically with
their rating of equipment. Seventy-six per
cent believed conditions were good or excellent,
but a strong minority of 21 per cent thought
working conditions were only “fair.” Considering
the fact that much of their work is rough,
dangerous, outside work away from home for
different employers, the proportion of satisfaction
with working conditions seems to be reasonable.

Table 2

Union Member Opinions of Selected Employer Administrative Practices and Job Working Conditions,
Arranged in Descending Order of Favorable Responses Based on 140 Replies

Per Cent of Responses

                                            Total        Total        Most         Second Most  Second Most  Most
Subject of Opinion                          Favorable    Unfavorable  Favorable    Favorable    Unfavorable  Unfavorable  Undecided

Extent to which employer does not
  interfere with job performance            93           7            29           64           6            1            0
Freedom of job from monotony                88           12           17           71           8            4            0
Degree of employer cooperation with local   81           18           12           69           14           4            1
Over-all opinion of present job compared
  to factory work at same pay               80           18           64           16           12           6            2
Over-all appraisal of working conditions    76           23           9            67           21           2            1
Extent to which good tools and equipment
  are provided                              75           25           9            66           24           1            0
Over-all attitude toward employers          69           29           12           57           24           5            2
Extent to which wage scale is as high as
  it should be                              67           28           7            60           23           5            5
General appraisal of morale of other
  local members                             66           31           12           54           23           8            3
Desire to work for same employer
  continuously                              [...]        [...]        6            21           55           16           2

Employees were least satisfied with wages.
Twenty-eight per cent disagreed with the
statement, “Your present wage scale is as
high as it should be.” A negative [...]
controversial statement might have
produced different results.

More than a majority of the respondents
had a favorable over-all opinion of employers
and believed they were positively trying to
cooperate with the union. These [...]
reflect a decade of harmonious relations between
employer and employee. Eighty-one per cent
believed that employers usually try to
cooperate with their union. Sixty-nine per cent
rated their employers “good” or “excellent”
employers. Since each respondent
had worked for many employers, the sum of
the respondents’ judgments should include
almost every employer in the group. Employees’
appraisal of their employers was
almost identical to their appraisal of the morale
of other employees, which indicates that they
thought others felt just as they did about their
mutual jobs and bosses.

The problem of morale and working condj- 
tions was approached from another point by 
asking employees, “Would you change places 
with a factory worker at the same pay?” 
Sixty-four per cent said “No” and another 
16 per cent said “Unlikely,” which is further 
evidence of their general job satisfaction.
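The arithmetic behind Table 2 can be checked with a short sketch. It assumes that each "total" column is simply the sum of its two component responses and that the five-number rows pair with the subjects in the order printed; the short row names are labels made up for this illustration.

```python
# Percentages transcribed from Table 2, in column order:
# most favorable, second most favorable, second most unfavorable,
# most unfavorable, undecided. Row names are illustrative shorthand.
ROWS = {
    "non-interference with job performance": (29, 64, 6, 1, 0),
    "freedom from monotony": (17, 71, 8, 4, 0),
    "employer cooperation with local": (12, 69, 14, 4, 1),
    "present job vs. factory work": (64, 16, 12, 6, 2),
    "working conditions": (9, 67, 21, 2, 1),
    "tools and equipment": (9, 66, 24, 1, 0),
}


def summarize(row):
    """Collapse a five-way breakdown into favorable, unfavorable and total."""
    most_fav, second_fav, second_unfav, most_unfav, undecided = row
    return {
        "favorable": most_fav + second_fav,
        "unfavorable": second_unfav + most_unfav,
        "total": sum(row),
    }


for subject, row in ROWS.items():
    print(subject, summarize(row))
```

Under these assumptions every row accounts for 100 per cent of the replies, and the favorable totals match the figures quoted in the text (for example, 17 + 71 = 88 per cent for freedom from monotony).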


Summary 


It is practical and useful for a union organization
to sponsor an employee attitude survey,
in the same way that attitude surveys by
employers have proven useful. Furthermore,
it seems appropriate that the survey be directed
toward union practices as well as employer
practices. Both union and employer share
authority in determining the policy under
which the worker works. Both are in frequent
contact with him and affect his work situation.
There is no reason to study his attitude toward
only the employer.

Both the union and the employers developed 
an early interest in the survey reported herein, 
even though only the union sponsored it. 
Both groups felt that its results would help 
them understand the strengths and weaknesses 
of their relationship. 

The survey showed that employees were 
reasonably satisfied with their union and their 
employers, which reflected many years of 
cooperation. However, an important minor- 
ity ranging from 21 to 27 per cent were not 
satisfied with the following union practices: 
fairness in assigning jobs, provisions of the 
present labor contract, and the local’s com- 
munications on union matters, including the 
labor contract. Members were moderately 
satisfied with grievance administration, only 
11 per cent dissenting. They also were 
strongly satisfied with union meetings. Fur- 
thermore, three-fourths (77 per cent) believed 
they were stronger than employers at the 
bargaining table. They also favored union 
political action (76 per cent), felt that their 
associates were cooperating to support the 




same political candidates (80 per cent), and 
felt they strongly influenced election results 
(84 per cent). 

Most employees were satisfied with their 
job freedom (93 per cent), lack of job monotony 
(88 per cent), degree of employer cooperation 
with their local (81 per cent), desirability of 
their job compared to a factory job (80 per 
cent), and over-all working conditions (76 per
cent). However, a strong minority felt that
the attitudes of other employees toward em-
ployers were only fair or poor (31 per cent),
which indicates a need for better understanding
between employees. A sizeable minority also
were dissatisfied with wages (28 per cent),
with the necessity to work for different em-
ployers (27 per cent), and with the quality of
tools and equipment furnished by the employers
(25 per cent).

The main dissatisfactions uncovered in this 
survey were in the areas of communication 
and practical day-to-day problems, rather than 
in areas producing a conflict of principle. The
day-to-day problems can be resolved or reduced
within the existing pattern of employer-union
cooperation, aided by such techniques as union-
sponsored and management-sponsored attitude
surveys.


Received November 26, 1951. 


Attitudes of Personnel Managers and Student Groups
toward Labor Relations 


Raymond E. Bernberg 
Los Angeles State College 


In pursuing a college course of study in
preparation for a career in personnel work in
industry, a student may choose between two
possible major studies: Industrial Psychology
(in the Department of Psychology) or Personnel
Management (in the Department of Business).

It seemed to the author that a dichotomy
in attitudes and values exists between these
two groups of students, both of whom are
preparing for the same career. This suggested
three problems for investigation: 1) Can this
apparent difference in attitudes and values
be defined by psychological tools? 2) Does
a similar dichotomy exist among personnel
managers? 3) Which student group is more
similar in attitudes and values to personnel
managers?


Procedure 


The Allport-Vernon Study of Values (1) and
the Labor-Relations Information Inventory (4)¹
were selected as instruments to measure
the differences between the groups. The
Allport-Vernon scale of values measures the
relative strength of six basic interests or motives:
theoretical, economic, aesthetic, social, political,
and religious. The Labor-Relations Inventory
indirectly assesses attitudes toward labor.

The tests were administered to 51 Industrial
Psychology majors who were upper-division
and graduate students from colleges and
universities in the Los Angeles area and from two
universities in the middle-west; 61 Personnel
Management majors from the same sources; and
43 Personnel and Employment Managers
in the Los Angeles area. The subjects
constitute an approximate 30% of those to
whom materials were given. The whole sample
is dominantly male with but a few female
subjects throughout.

¹ To reduce printing costs, the Labor-Relations Information
Inventory has been deposited with the American
Documentation Institute. Order Document [...] from
American Documentation Institute, 1719 N Street,
N. W., Washington 6, D. C., remitting $1.00 for microfilm
(images 1 inch high on standard 35 mm. motion-
picture film) or $1.00 for photocopies (6 X 8 inches)
readable without optical aid.


Results 


The results of an analysis of the data are 
contained in Table 1. The students as a group 
have significantly higher mean scores in their 
attitude toward labor than those in the field. 
According to the norms of the test, the student 
means lie in the “neutral zone” of attitude 
toward labor, while those in the business world 
have a mean score which is in the “pro-manage-
ment zone” of attitude toward labor. The
employed group also have a narrower range of 
scores, indicating greater homogeneity. Both 
student groups have broader ranges, reflecting 
their heterogeneity with respect to this atti- 
tude. 

On the Scale of Values test, Industrial 
Psychology majors have significantly higher 
mean scores for the theoretical and the aesthetic 
values than either of the other two groups. 
They have significantly lower mean scores than 
the other two groups on the economic and on 
the religious values. The Personnel Manage- 
ment students and the Personnel and Employ- 
ment Managers show similar means on all of 
the values. 
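The significance tests in Table 1 are reported as critical ratios (C.R.) with probabilities. The paper does not state the formula, so the following is only a sketch under an assumed textbook definition: the difference between two independent means divided by the standard error of that difference, with a two-tailed probability taken from the normal curve. The function names are illustrative.

```python
import math


def critical_ratio(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Difference between two independent means over its standard error."""
    standard_error = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return (mean_1 - mean_2) / standard_error


def two_tailed_p(cr):
    """Two-tailed normal-curve probability of a ratio at least this large."""
    return math.erfc(abs(cr) / math.sqrt(2))


# Labor Relations Inventory, Group A (N=51) against Group B (N=61),
# using the means and standard deviations printed in Table 1.
cr_ab = critical_ratio(13.56, 4.70, 51, 12.93, 4.68, 61)
p_ab = two_tailed_p(cr_ab)
print(round(cr_ab, 2), round(p_ab, 2))  # 0.71 0.48
```

For this comparison the sketch reproduces the tabled values of .71 and .48. Other cells agree only approximately, which suggests the original computation differed in some detail (for example, in the divisor used for the variances).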

In interpreting these data, one should rec- 
ognize the great differences in training between 
the Industrial Psychology and Personnel Man- 
agement students. The high value of the one 
for the theoretical opposed to the low value for 
the economic may be an indication of the 
emphasis in curriculum and course material 
in psychology, even in the applied area. 
Nevertheless, two possibilities suggest them- 
selves: (1) students electing the Industrial 
Psychology major may have entered college 
with a more liberal attitude toward labor and 
a more theoretical orientation; (2) the differ-
ences appearing between the two groups may
result from differences in expressed attitudes
of their instructors within the different cur-
ricula (2, 3).

In other words, are attitudes and values such
as we are discussing engendered by training
in appropriate curricula, or is basic motivation
reflected in these attitudes and values and
manifested in the choice of the different courses
of study? It is quite possible, of course, for
these two factors to coexist.

If it is true that the attitudes and values are
stable and resistant to change, it seems likely
that most students in the Industrial Psychology
major face an experience of frustration, through
inability to accept the attitudes and values of
those under whose executive direction they
will work. We do not know the answer to
this problem, but it is evident that the answer
is needed to give us a better basis for vocational
guidance in this area.

Table 1

Mean Scores, Standard Deviations, and Tests of Significance for Students Majoring in Industrial Psychology,
Students Majoring in Personnel Management and for Employed Personnel Managers on the
Allport-Vernon Scale of Values Test and the Indirect Test of Attitude Toward Labor

                Labor
                Relations    Allport-Vernon Scale of Values
Group*          Inventory    Theoretical  Economic     Aesthetic    Social       Political    Religious

A       Mean    13.56        35.6         30.6         27.9         30.8         30.8         [...]
        S.D.    4.70         7.58         7.07         8.39         6.37         6.27         [...]
B       Mean    12.93        29.6         35.1         22.7         30.0         31.9         [...]
        S.D.    4.68         6.18         7.11         6.29         5.27         5.40         [...]
C       Mean    9.26         30.5         33.7         22.2         30.6         32.6         [...]
        S.D.    4.44         5.86         6.32         7.06         4.27         4.44         [...]

A-B     C.R.    .71          4.60         3.42         3.75         .73          1.10         4.52
        P       .48          .001         .001         .001         .46          .27          [...]
A-C     C.R.    4.53         3.65         2.24         3.56         [...]        1.63         3.24
        P       .001         .001         .025         .001         .86          .11          [...]
B-C     C.R.    4.06         .74          1.07         .36          .63          .71          .76
        P       .001         .46          .29          .72          .53          .48          [...]

* Group A: Industrial Psychology Majors (N=51); Group B: Personnel Management Majors (N=61);
Group C: Personnel and Employment Managers (N=43).


Received November 24, 1951. 


References 


1. Allport, G. W., and Vernon, P. E. A test for per-
sonal values. J. abnorm. soc. Psychol., 1931,
26, 231-248.
2. Seashore, H. G. Validation of the Study of Values
for two vocational groups at the college level.
Educ. psychol. Measmt, 1947, 31, 249-259.
3. Stagner, R. Psychology of personality. New York:
McGraw-Hill Co., 1949.
4. Weschler, I. R. An investigation of attitudes toward
labor and management by means of the error-
choice method. J. soc. Psychol., 1950, 32, 51-62.


Empathy Test Scores of Union Leaders 


Raymond H. Van Zelst 


Illinois Institute of Technology 


A careful survey of recent psychological
literature reveals numerous articles relating
to a component of the concept of empathy.
However, in extricating the kernel of empathy
from the abundance of associative material
one becomes aware of a rather marked lack of
concerted effort dealing with the direct
measurement of this important talent. In no way
is this ablation of empathy meant to deter
from the important, and, in fact, corporate
factors (e.g., sympathy, insight, etc.) contributing
to the concept of empathy. Rather, the
author herein seeks to adopt an isolative view
of the problem by dealing with a concept which
may be most appropriately entitled massempathy.

The particular trait known as empathy has
been defined by various sources (5, 9, 10, 11, ...)
as the “ability to put oneself in the
other person’s position, establish rapport, and
anticipate his reactions, feelings, and behavior.”
The concept of massempathy, or what
the instrument used in this study is designed
to measure, may be said to connote the ability of an
individual to “feel into” the average or
hypothetically average person’s [...]
being [...]. This trait may be seen to be
empathic ability plus, since one must not
only “feel into” or interject oneself into the
structural field of another specific person, but
also assume the “average” person’s structural
field. “The difference between our official or
average or cultural expectation of action in a
situation and the actual conduct of the person
indicates the presence of a private version of the
situation. Where the private version equals the
official one we have normal conduct” (4).

This article, therefore, will present
evidence pertaining to the validity and
reliability of a recently devised test designed
to assess empathic or massempathic ability.


Description of the Test


In accord with the above definition of
empathy it was the belief of the authors of

the Empathy Test that “individuals who are 
superior in empathic ability are persons who 
are above average in understanding and antici- 
pating the reactions of other people” (8). This 
belief was utilized in the construction of the 
test. 

The Empathy Test is composed of three 
parts. Part I consists of various types of 
music (e.g., polkas, classical, etc.) and the 
subject is requested to “rank these types in 
order of their probable popularity among the 
non-office factory workers of the United 
States.” The types of music used are based 
upon the advice of an RCA Victor phonograph 
records sales executive and the normative data 
supporting this section are the results of a 
national survey program. 

Part II contains the titles of 15 current well- 
known magazines and the testee is required to 
rank them “in order from most to least total 
paid circulation.” Actual rankings were ob- 
tained from published actual sales of the 
various publications. 

Part III has the subject rank ten commonly 
annoying experiences according to their specific 
annoyance to the general population. Norma- 
tive reports are based upon the extensive 
findings of Hulsey Cason (2), 

Each section, therefore, requires the individual
to define specific normative reactions
and so is in keeping with the proposed definition 
of empathy. Scoring of the test is objective 
and the amount of time needed for administra- 
tion short (15 min.). 


Subjects and Criteria 


Subjects used in this validation of the 
Empathy Test are 64 business agents—a total 
sample of five A F of L building trades unions 
situated in the Chicago area. The main duties 
of these men are to: (1) enforce union rules; 
(2) recruit and organize new members; (3) 
settle grievances and disputes; and (4) handle 
all benefit payments. All of these duties may
be considered as constituting activities which
require definite leadership abilities such as
tact, justice, sound judgment, determination,
responsibility, etc. for efficient performance (3).

Six criteria were selected as gauges for the
leadership ability of these business representa- 
tives. The first is a ranking of these men as 
to “Success at present leadership level and 
probability of leadership ability at a higher 
leadership level” by their immediate district 
superiors. Each of the five groups was ranked 
by four judges, the district president, vice- 
president, secretary, and treasurer of the particular
union, working independently. All of
the judges were well acquainted with each 
individual’s record as a union official. Relia- 
bilities of the rankings as determined by the 
method of average intercorrelation (7) were 
.92, .91, .91, .89, and .88. 

The second criterion utilized was voting 
record, that is, the percentage of the total 
number of votes cast which the individual 
received in the most recent local election. The 
validity of this criterion may be contaminated 
by the caliber of the opposition. 

The third criterion was score on the File- 
Remmers How Supervise? test (Form M), a 
reliable and validated measure of supervisory 
knowledge and insight (1). 

The fourth criterion used was the recruitment 
and organization of new members. The main 
contamination present here is the fertility of 
the area to which the operations of the business 
agents are limited. This was controlled by 
taking per cent increase over previous year’s
local size.

The fifth criterion employed was the ability 
to settle grievances and disputes. Complete 
records are kept by the district offices of all 
such occurrences which require the attention 
of the local business agent. These records 
were taken individually and carefully evaluated 
in detail by the district president, vice-presi- 
dent, secretary, and treasurer, working indi- 
vidually, and assigned a point rating of 0, 3, 6, 
9, or 12 according to estimated successfulness 
of handling. The mean accumulated score 
was used as the index of proficiency. 

The sixth criterion utilized was the enforce- 
ment of rules and regulations. Violations of 
union working rules other than grievances and 
disputes which were reported by sources other 
than the local unions’ business representative
were used as an index of laxity. In order to
minimize possible errors due to the opportunity
for such violations to arise, the number of
violations reported was divided by the number
of jobs in progress in the area of the local
representative’s jurisdiction.

The author then endeavored to ascertain the 
success of these business agents in this complex 
activity by singling out these six applications 
of leadership ability and measuring them. 
These six aspects are in the opinion of the 
cooperating district officials the keystones of 
the entire activity of these union business 
agents. Furthermore, these significant criteria 
are readily observed and accurately measured. 


Results 


Pearsonian correlation coefficients were com-
puted between Empathy Test score and the six
criteria. The business representatives’ score
on the Empathy Test correlated .67 with leader-
ship rank, .38 with percentage of vote received
in local union election, .55 with How Supervise?
score, .60 with recruitment and organization
of new members, .64 with the ability to settle
grievances and disputes, and .44 with enforce-
ment of rules and regulations. The multiple
R between Empathy Test score and the six cri-
teria is .76. All coefficients are statistically sig-
nificant at the one per cent level of confidence.

Reliability of the Empathy Test was ascer-
tained by means of the odd-even technique.
Odd items on each subtest were utilized as one
score while the even numbered items consti-
tuted the other test score. Corrected relia-
bility was .87.
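The text gives only the corrected figure and does not name the correction. The usual adjustment for an odd-even split is the Spearman-Brown formula, which is assumed in the sketch below; the function names are illustrative, and the half-test correlation is back-calculated because the article does not report it.

```python
def spearman_brown(r_half, factor=2.0):
    """Step up a part-test correlation to a test `factor` times as long."""
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


def half_test_r(r_full):
    """Invert the two-half correction to recover the odd-even correlation."""
    return r_full / (2.0 - r_full)


implied_half = half_test_r(0.87)
print(round(implied_half, 2))                  # 0.77
print(round(spearman_brown(implied_half), 2))  # 0.87
```

If Spearman-Brown was in fact the correction used, a corrected reliability of .87 implies an uncorrected odd-even correlation of about .77.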

The substantial coefficients arrived at be-
tween test score and the six criteria of leader-
ship ability seem to suggest that the Empathy
Test may be profitably employed in the predic-
tion and selection of potential union leaders.
It is evident, however, that the findings are
to be cross-validated before they may be
accepted as fully established.


Received October 17, 1951. 


References 


1. Buros, O. K. The third mental measurements year-
book. New Brunswick: Rutgers Press, 1949.
2. Cason, H. Common annoyances: a psychological
study of everyday aversions and irritations.
Psychol. Monogr., 1930, 40, No. 2 (Whole No.
182).
3. Coffin, T. E. A three component theory of leader-
ship. J. abnorm. soc. Psychol., 1944, 39, 63-83.
4. Dollard, J. Criteria for the life history. New
Haven: Yale Press, 1935.
5. Dymond, R. F. A preliminary investigation of the
relation of insight and empathy. J. consult.
Psychol., 1948, 12, 228-233.
6. Dymond, R. F. A scale for the measurement of
empathic ability. J. consult. Psychol., 1949, 13,
127-133.
7. Guilford, J. P. Psychometric methods. New York:
McGraw-Hill, 1937.
8. Kerr, W. A., and Speroff, B. J. Measurement of
empathy. Box 1625, Chicago 90: Psychometric
Affiliates, 1951.
9. Miller, F. G., and Remmers, H. H. Studies in
industrial empathy: II. Management’s attitude
toward industrial supervisors and their estimates
of labor’s attitude. Personnel Psychol., 1950, 4,
33-40.
10. Remmers, L., and Remmers, H. H. Studies in
industrial empathy: I. Labor leaders’ attitude
toward industrial supervisors and their estimates
of labor’s attitude. Personnel Psychol., 1949,
3, 427-436.
11. Speroff, B. J. Evaluation of a mass empathy meas-
urement on male college students. Paper delivered
at Midwest. Psychological Association meeting,
Chicago, 1951.
12. Van Zelst, R. H. Validation evidence on the em-
pathy test. Educ. psychol. Measmt, in press.
13. Warren, H. C. Dictionary of psychology. Boston:
Houghton Mifflin Co., 1934.


An Appraisal of Worker Characteristics as Related to Age* 


William H. Bowers 
The Ohio State University 


Since 1900 the number of persons age 65 
and over has grown from about 3 million to 
nearly 12, and those between 45 and 64 years
of age from nearly 10½ to 31 million (6).
However, age discriminations in employment 
limit the potential productivity of many older 
individuals. Increases in mandatory pension 
plans may operate to reduce still further the 
work opportunities available to older persons. 
Parran estimates that 1,500,000 persons are 
prematurely retired and that as a result 
$4,500,000,000 a year is lost in productive 
earning capacity (3). Life has been made 
longer, without provision for extending pro- 
ductivity. However, these issues are not being 
ignored. The National Conference on Ageing 
identified numerous problems of oldsters (1), 
the Desmond committee of the New York 
Legislature has stressed especially employment 
problems in the older years (2), a recent issue 
of the Annals of the American Academy of 
Political and Social Science deals with oldsters 
in the community and in industry (5), and 
continuing studies by the United States Em- 
ployment Service concern the employability 
of older persons. These and other reports 
emphasize the undesirability of prevailing 
practices regarding hiring and retirement in 
relation to age, and suggest various means by 
which even a person somewhat handicapped 
by age may be kept on the job, as by transfer 
to less exacting duties, part-time employment, 
or establishment of departments specifically for 
older workers (4, 7). 

A major research need is clearly for direct 
and practical appraisals of the competence of 
older as compared to younger workers in a 
variety of actual working situations. Such 
findings could become basic in the formulation 
of retirement policies, job adjustment to facili- 
tate continued employment of older workers, 


* The writer was assisted in this investigation by 
Mark Weldon Smith, Research Fellow, The Ohio State 
University Development Fund. The project was under 
the direction of Dr. S. L. Pressey. 


selection of older persons for employment, and 
induction and training plans for older workers. 
In the absence of such findings, administrative 
actions affecting older workers will continue to 
be based on common impressions, half-truths, 
and special cases. The present investigation 
attempts a simple appraisal of the comparative 
performance of industrial workers of different 
age groups in a large organization, and a rough 
analysis of factors involved. 


Cases and Materials 


This study is concerned with 3,162 workers 
of both sexes, ranging in age from 18 through 
76 and performing various duties in an organi- 
zation that includes a variety of operations: 
The workers covered ranged from foremen
and minor executives (more important execu-
tives and supervisory personnel were not
included in the cases studied), through skilled
craftsmen, operators of both heavy and light
machinery and equipment, and inspectors or
clerks, to unskilled labor. In total it
is believed that the variety of work and types
of workers covered were as great as would be
found in most plants with payrolls of four or
five thousand employees. The supervisory
methods and records were judged rather better
than in many industrial plants. The coopera-
tion of the management was excellent. The
data were obtained from the personnel records
maintained for each employee, which included
age, date of hiring, and appraisals by foremen
concerning competence. All this material was
recorded in brief by the investigators on sum-
mary cards. Appraisals were made when
administrative actions were taken affecting
employees, upon completion of training courses,
as a basis in recognition for outstanding service,
in connection with a grievance of an employee,
and more generally where some record seemed
appropriate. For many employees, several ap-
praisals were found, made by various super-
visors. All of these were considered.




The appraisals consisted of highly informal
statements made by the foremen and covered,
for one worker or another, approximately 300
terms covering a wide range of abilities, char-
acteristics and traits. It is felt that the evalua-
tions were of especial descriptive worth, since
they represented unstructured statements made
by supervisors, in their own language, with
reference to each worker as he was observed
in his working situation. The purpose of
these records was not primarily to determine
promotions or other administrative actions
which so obviously and promptly affected the
worker as to tend to make the ratings biased,
but rather to provide meaningful appraisals
for long-time policy, which could be referred
to by the personnel department on occasions
when a descriptive picture of the particular
worker was desired.

To reduce the number of terms in these
appraisals to a workable list, those which were
ambiguous or infrequently mentioned were
eliminated or combined with others of similar
meaning, after consultations with personnel
representatives and foremen and careful con-
sideration by the writer and several associates.
The 300 terms were finally reduced to a basic
list of eight abilities, eight character traits, and
four common faults which were felt to be largely
unambiguous and fairly specific in the working
situation. In handling the data, the first
step was to find the numbers of persons in
each age group (and of each sex) who were
reported by foremen to have any of the twenty
traits. Minus values were subtracted; thus,
the number called inefficient was subtracted
from the number called efficient. Then, net
figures were converted into net percentages.
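
The netting step just described can be illustrated with a short sketch. The article does not state the base used for the percentages; the sketch assumes it is the number of workers in the age and sex group. The function name and the counts are hypothetical.

```python
def net_percentage(favorable: int, unfavorable: int, group_size: int) -> float:
    """Net percentage of a group credited with a trait.

    Assumption (not stated in the article): the base is the number of
    workers in the age-sex group being summarized.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return 100.0 * (favorable - unfavorable) / group_size


# Hypothetical counts, not taken from the study's records.
print(net_percentage(favorable=45, unfavorable=5, group_size=100))   # 40.0
print(net_percentage(favorable=3, unfavorable=12, group_size=100))   # -9.0
```

A negative result, as in the second call, corresponds to a trait whose unfavorable form was mentioned more often than its favorable form.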


Results 


Table 1 shows net percentages of men and
women in each age group with over two years’
service (and thus presumably in the organiza-
tion long enough to have been fairly well
appraised) who had been mentioned by fore-
men as having each trait. Age groupings are
in accordance with common census practice.


Table 1 


Percentages of Employees with Over Two Years’ Service Mentioned by Foremen
as Having Specified Traits


Ages of Men                  Ages of Women

18-29 30-44 45-59 60up 18-29 30-44 45-59 60up

BAicten uy E 4 40 40 41 37 41 48 

Job Kn 15 13 12 1 8 12 12 8 

Ability (ease A 7 2 3 u Ca 4 

caa Learn* 7 4 3 34 15 22 20 

ft 11 1 
“tative 6 7 5 5 5 6 4 
har 

Cos acter 8 25 24 24 19 29 24 

Depeetativeness# 2 Šo n % 2% 19 3 %4 

Att ability 27 10 13 15 3 5 12 24 

ia 3 3 4 6 5 10 E 

Steag eines 3 aC D 8 ng 2 28 
Co,  Nesg* 10 14 8 

nscientig 10 8 10 ie 6 17 20 

ushiease 
Tacs hee 15 11 11 ó 14 12 20 
Fault = 
Slowneset 5 1 -2 3 x 2 = 2 4 
Stabilit 7 2 3 2 4 6 12 4 
316 11 
Total 990 64 58 5 242 165 25 


* Traits showing a difference of 5 or more between the first two as compared
to the last two age groups. The following items were omitted as showing no net
percentage in any age group for either sex of 10 or more: job satisfaction,
promotion possibilities, supervisory ability, work without supervision,
physical difficulties and over-talkativeness. Thus, the highest mention of
physical difficulties was 6 for men 60 and over.




The data would seem to indicate that age 
differences in traits were relatively small. 
Older workers of both sexes with over two 
years’ service were reported to learn less readily 
and to be slower, but also more frequently 
to show good attendance, steadiness, and 
conscientiousness. These trends are quite con-
sistent; thus women under 30 were fairly often
mentioned as rapid workers (giving a net of
-9 per cent slow), but the net figures shift
steadily to 4 per cent slow over 60. In the
other traits covered, age differences appear to
be negligible, or not consistent. For example,
the difference in efficiency between the oldest
and youngest group of men is 9 net percentage
points favoring the younger, but there is only
a 1 point drop between the 30-44 and 60 up
group. However, the oldest women are con-
sidered more efficient than the younger by
7 points. No consistent age differences appear
in job knowledge, accuracy, dependability, or
emotional stability. As mentioned in the foot-
note to Table 1, physical difficulties were not
mentioned often enough even for the oldest
group (6 per cent mentions for men over 60)
to indicate that such handicaps were of
importance.

The preceding table deals with traits, not 
persons. For practical purposes it seemed 
desirable so to combine information on each 
person that a total appraisal of his competence 
could be obtained. A net evaluation of each 
worker was therefore obtained by summing 
favorable characterizations and subtracting 
the unfavorable ones in terms of the above 
mentioned traits, with special weighting for the 


Table 2 


Distributions of Net Appraisals of Competence for Men and Women with Over 
Two Years’ Service, and Men with Under Two Years’* 


Net         Men Over 2 Years       Women Over 2 Years      Men Under 2 Years
Appraisal 18-29 30-44 45-59 60up 18-29 30-44 45-59 60up 18-29 30-44 45-59 60up
13 1 1 
12 1 1 5 1 
11 . 4 2 1 1 1 1 
10 1 5 2 1 2 2 2 
9 G12 O 5 7 3 1 3 a 
8 Seems 7 5 5 8 5 3 1 3 
7 10ST 24 9 4 5 10 2 5 5 2 
6 18 "48 325 0 B3 g 7 2 8 on 
am. 2 F 2% 20 9 7 2 1 5 2 ua 
5 A 20m a27 & 22 3 2 2 4 15 3 
5 ae: Toe 21 16 3a Ww 3 3 400 t; 
F a SL rp 5$ 45 10 32 36 15 ; 
ò sU? 8 9 7 3 2 2n 8 
a 12 23 32 20 2 11 5 1 27 21 9 5 
E- 8 2 43 © 49 4 10 9 17 20 7 
z 6L m 7 6 1 8 2 16 7 3° 
E 5 12 5 6 3 4 3 6 
E A 3 2 1 3 1 i 2 3 1 
= $ 1 i 1 2 1 1 
—7 1 
-8 1 
Total el OR 82 162 129 17 228 as 17 % 
Mean 32 9-2 EES dwa Ze aa, ao 19 24 28 3° 
% Over3 50 47 42 40 52 46 47 53 , 3 s 42 
30 36 41 
% Under3 0 2 1 1 4 1 2 6 2 2 2 0 


* Figures in boldface indicate intervals in which the means fall. 




important factor of efficiency (3 points were
given if efficiency was stated to be outstanding,
2 if very good, 1 if good or average, -1 if
poor, -2 if unsatisfactory, and -3 if dismissed
or disciplinary action had been taken because
of inefficiency). Table 2 shows distributions
and means of these net evaluations in the
various age groups. Data are shown again
for those with over two years’ service and also
for men with less time in the organization;
women with less than two years’ service were
omitted because few of them were in the older
groups, and many had not been appraised.
The wide range of scores for both men and
women suggests that the evaluative procedure
did discriminate between good and poor em-
ployees. The means show no large differences
between age groups. Inspection of the total
distributions also indicates little difference in
the extremes. The comparatively small num-
ber so good as to obtain favorable comments
totaling 10 or more are all under 60. But
40 per cent of the men 60 and over with two
years’ service scored over 3, as compared with
50 per cent for the youngest group. Women
with over two years’ service showed 52 per cent
of the 18-29 group so scoring as compared
with 53 per cent of the few cases 60 and over.
None of the age groups showed any appreciable
number of individuals scoring less than minus
three. Both the means and the distributions
for the various age groups, of the net evalua-
tions for workers with over two years’ service,
seem to indicate the value of retaining older
workers.
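
The scoring rule for the net evaluation can be sketched as follows. The efficiency points are those listed in the text. The article does not spell out how the other traits were counted; the sketch assumes each favorable mention adds one point and each unfavorable mention subtracts one. The function name, the rating labels used as dictionary keys, and the example worker are hypothetical.

```python
# Points for the efficiency rating, as listed in the text.
EFFICIENCY_POINTS = {
    "outstanding": 3,
    "very good": 2,
    "good": 1,
    "average": 1,
    "poor": -1,
    "unsatisfactory": -2,
    "dismissed or disciplined for inefficiency": -3,
}


def net_appraisal(efficiency, favorable_traits, unfavorable_traits):
    """Weighted efficiency points plus favorable minus unfavorable mentions.

    Assumption: every trait other than efficiency counts as one point.
    `efficiency` may be None when no efficiency rating was recorded.
    """
    score = EFFICIENCY_POINTS[efficiency] if efficiency is not None else 0
    return score + len(favorable_traits) - len(unfavorable_traits)


# Hypothetical worker, not drawn from the study's records.
print(net_appraisal("very good", ["attendance", "steadiness"], ["slowness"]))  # 3
```

Under these assumptions a worker rated "very good" with two favorable and one unfavorable mention lands at 3, close to the group means reported in Table 2.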
In Table 2 the last four columns cover men
hired during the past two years. These
show both a steady rise in mean appraisal
(from 1.9 to 3.0) and a larger percentage of
older workers scoring over 3. These findings
for those recently hired are felt to be
important. If men hired even in their
sixties are found satisfactory, it would seem
that a large labor pool of older workers could
be turned to in time of need.

1 Because of the confidential nature of the personnel
records, evaluations of key supervisors were not avail-
able. Many of these supervisors are older persons; had
they been included, the net evaluations favoring the
older age groups might have been higher.

Table 3 gives distributions and means of net
appraisals of employees 60 and over, in a



Table 3 


Distributions of Net Appraisals of Competence of All 
Employees 60 and Over for Those Both Over 
and Under Two Years’ Service* 


Males Females†
Net 
Appraisal 60-64 65-69 70-up 60-64 65-69 
9 4 2 1 
8 4 1 3 
7 6 3 2 1 
6 19 6 1 
5 19 3 3 1 
4 19 3 4 3 1 
3 20 6 1 3 
2 20 8 1 1 
1 20 5 3 3 
0 13 9 3 1 
-1 15 4 1 
—2 7 
—3 5 1 
—4 1 al 
—5 
-6 1 
Total 169 54 17 15 7 
Mean 2.5 3.1 3.1 4.3 2.9


* Figures in boldface indicate intervals in which the 
means fall, with the exception of the last column in 
which the interval containing the median is boldfaced. 

† No women were listed as 70-up.


breakdown by five-year intervals. Cases over 
and under two years’ service are here combined. 

A considerable number of men over 65, and 
some even 70 and over, were on the payroll; 
and none in this oldest group was given a 
minus appraisal. A large proportion of all 
those of both sexes, 65 and over, appear to 
have been distinctly satisfactory. 


Major Conclusions 


The organization in which this study was 
made employs many older workers, including 
an appreciable number over 65. That most 
of these older workers were considered compe- 
tent on the job and compared favorably with 
younger workers in the organization would 
seem a justification for their employment, 
evidence of wise personnel policy in their use, 
and evidence against arbitrary retirement at 65. 

That men hired for the first time when over
45 and even when 60 or over (male workers
with less than two years’ service in the upper
two age groups) were given higher net ap-
praisals than younger men currently being
hired seems even more striking vindication of
the older worker as being competent. The
common business practice of refusing to hire 
a new worker over 45 seems even contra- 
indicated—at least for some types of work, 
since the older applicant seems to be the better 
risk! At all ages there are great individual 
differences. Clearly, selection should be on 
other bases than mere chronological age. 
Certain changes in traits with age, and large 
individual differences in traits at every age, 
all indicate the need for individualized selection 
and placement, and for job readjustment of 
some with increasing age or because original 
placement was in some respect unsatisfactory. 
Many such readjustments were made in this
organization. More such readjustments might
be expected to make older workers even more
satisfactory.


Summary 


Current census data suggest that by 1960 
there will be nearly 15 million persons aged 
65 and over, and 35 million between ages 45 
and 64 or a total of 50 million over age 45. 
From the standpoint of economic balance and 
defense, the manpower potential of an increas- 
ing older population should be carefully 
studied. 

The present investigation involved a detailed 
appraisal of the personnel records of several 
thousand workers between the ages of 18 and 
76 in a large organization. The frequency of 
mention of each common trait at each age was 
determined, and the net total of favorable 
and unfavorable traits mentioned by super-
visors for each employee was used as an index
of his worth. The following findings stood
out:

1. Older workers were mentioned somewhat
more frequently than younger as not learning
readily and as being slow; but the older men
were considered to be better in attendance,
steadiness, and conscientiousness. The re-
maining traits showed no consistent changes
with age.

2. Net appraisals of workers showed negli-
gible differences between age groups but great
individual differences at every age. Employees
over 65, and even 70 and over, were appraised
very favorably. It would seem clear that
arbitrary retirement at 65 or hiring discrimina-
tions against employees because of age have
little warrant.

3. The organization has in the last two years
taken on many workers over 45 and even over
60. Most of these were appraised favorably.
Age limits in hiring seem clearly questionable.

Quite evidently workers should be employed
and retained on the basis of merit without
reference to age. Biases and misconceptions
limiting the use of older persons should be
replaced by facts. Oldsters can maintain pro-
ductivity, thus making an extended produc-
tive life worth while, strengthening manpower
resources, and lessening possible economic
burdens resulting from dependency of large
numbers of non-productive older persons.


Received July 7, 1952. 
Early publication. 


References 


1. National Conference on Ageing. Man and his years.
Health Publications Institute, Raleigh, North
Carolina, 1951.

2. New York Joint Legislative Committee on Problems
of the Ageing, Albany, N. Y. Never too old,
Legislative Document No. 32, 1949; No time to
grow old, Legislative Document No. 12, 1952.

3. Parran, T. Must you retire at 65? Colliers, 1952,
129, 15.

4. Stanton, Jeannette E. Part-time employment for
the older worker. J. appl. Psychol., 1951, 35,
418-421.

5. Tibbets, C. (Ed.) Social contributions by the ag-
ing. Ann. Amer. Acad. pol. soc. Sci., 1952,
279.

6. U. S. Department of Labor, Bureau of Labor Statis-
tics. Fact book on the employment problems of
older workers. Prepared for the Conference on
Ageing, Washington, D. C., August 13-15, 1950.

7. Welford, A. T., and Speakman, D. The employa-
bility of older people. The aged and society.
Champaign, Illinois: Twin City Printing Com-
pany, 1950, 198-199.




Relation between How Supervise?, Intelligence and Education for a 
Group of Supervisory Candidates in Industry* 


Frederic R. Wickert 
Michigan State College 


Informal impressions indicate that the File
and Remmers questionnaire How Supervise? (3)
continues to be rather widely used to measure
the attitudes and knowledges required for
supervisory success, notwithstanding its appar-
ent deficiencies as a measuring device for this
purpose. One rather obvious difficulty likely
to be encountered in interpreting the test
scores of industrial supervisors is, of course,
that the scores may be measuring facility with
words, or educational level, or verbal intelli-
gence, rather than supervisory abilities or
aptitudes.

The literature concerning the relationship
between How Supervise? and educational level
and intelligence at first glance has not, until
recently, been too informative.

File (1) reported a correlation of +.35
(N = 577) between How Supervise? scores and
education. He said with respect to this cor-
relation, “The optimum amount of correlation
that should exist between education and a
test of supervisory quality is problematic.
While such a test should not correlate highly
with amount of education, doubtless, formal
education does provide valuable learning situa-
tions which are generally helpful.” He further
indicated that much of the positive relationship
found between scores on his test and
education was due to the difference in
performance between those with college as
compared to those without college. He
explained this relationship without any reference
to the possible influence of verbal intelligence,
but in terms of training in thinking through
problems or of selection of persons at the college
level. He said, “. . . Selection at the college
level tends to ‘weed out’ individuals who have
failed to develop an understanding of the
fundamental factors of human relations, or

* 
p ughton 
Nba Grimsley and Dr. John F Maen project 


Which €d princi i ts of 
ipally in those aspec! client 
Compan èd a bearing on giving service to the 




colleges . . . provide considerable opportunity 
for gaining insight into human relations prob- 
lems.” 

Sartain (6) reported a correlation of —.44 
between the Adaptability Test and an “experi- 
mental” form of How Supervise? which was 
scored in such a way that “low scores” indi- 
cated “a favorable attitude.” The N was 40. 
In the text of his article, Sartain stated that 
his results indicated “that general mental 
ability goes with favorable supervisory atti- 
tudes to a moderate degree.” Sartain’s nega- 
tive correlation coefficient evidently indicated 
a positive relationship. 

File (2), in an article following Sartain’s
report, attempted to explain the difference
between his finding of a positive correlation
(+.35) between How Supervise? and education
and Sartain’s negative correlation (-.44)
between the Adaptability Test and How Super-
vise?, without apparently appreciating that
Sartain’s negative correlation meant that a
positive relationship really existed. Accord-
ingly, Sartain’s findings do not contradict
those of File, and File need not have been so
concerned about reconciling the results of the
two studies.
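
The point about Sartain’s reverse-keyed form is a general property of the correlation coefficient: reversing the direction of scoring on one variable changes the sign of r and leaves its size untouched. A minimal sketch follows, using invented scores for five examinees. The variable names and the ceiling of 50 used to reverse the scale are assumptions made only for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Plain product-moment correlation for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for five examinees (not data from any cited study).
ability = [10, 14, 15, 19, 22]
attitude_high_is_good = [31, 35, 33, 40, 44]
# Reverse-keyed form: subtract from an arbitrary ceiling so low means favorable.
attitude_low_is_good = [50 - s for s in attitude_high_is_good]

r_direct = pearson_r(ability, attitude_high_is_good)
r_reversed = pearson_r(ability, attitude_low_is_good)
print(round(r_direct, 3), round(r_reversed, 3))
```

The two printed values are equal in size and opposite in sign, which is why a coefficient of -.44 on a form where low scores are favorable describes a positive relationship between ability and favorable attitudes.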

Two recent articles have done much to 
clarify the interrelationships with which this 
paper is concerned. In the first article, 
Millard (5) reported that for office (higher 
level) supervisors the correlation between scores 
on How Supervise? and the Adaptability Test 
was + .22. However, the correlation for a 
group of factory foremen (lower level or first 
line) was + .71. 

The second recent article is that of Maloney 
(4). He made a Flesch analysis of How Super- 
vise?. His findings were: “The mean reada- 
bility for both forms, directions included, is of 
high school graduate difficulty.” Maloney 
then suggested that Millard’s high correlation 
between How Supervise? scores and scores on 
the Adaptability Test for foremen was not
surprising since both How Supervise? and the
Adaptability Test measure the ability to under- 
stand written material. Maloney concluded, 
“. . . for lower level personnel How Supervise?
is of doubtful validity as a measure of super- 
visory ability.” 

The present study most resembles that of 
Millard. However, rather than reporting cor- 
relations between How Supervise? and an 
intelligence test for two groups obviously 
differing not only in educational level but also 
in social status, type of work supervised, etc., 
the present study reports the relationship 
between How Supervise? and education and 
intelligence for but one group of subjects and 
then divides the total group into upper and 
lower educational groups. 

The data here reported are only a small part
of the data collected in connection with a
larger study. This larger study involved, in
part, administering a number of psychological
tests (including Form A of How Supervise?
and the Advanced Short Form of the Cali-
fornia Mental Maturity Test) to a group of
almost 100 candidates for supervisory
positions, as well as obtaining certain
biographical data on them. These men were
employed, almost all in the shop rather than
the office, of one plant of a large metal
products company.


Results 


Table 1 shows the interrelationships among 
scores on How Supervise?, the “language fac- 
tors” score on the Advanced Short Form of the 


Table 1 


Intercorrelations among How Supervise? Scores, Scores
on the Language Aspects of a General Intelli-
gence Test, and Years of Education


Variables 
Tests and Education A a 
1. How Supervise? 
.66* 4 
2. Verbal Intelligence +.66* ià i z 
3. Years of Education +454 +4.59 P 


* N = 95. All other correlation coefficients are based
on an N = 87. (All correlations are tetrachoric be-
cause these data were lifted from a large table of inter-
correlations where the numbers of coefficients to be
computed precluded the use of more refined measures
of correlation.)




Table 2


Relation between How Supervise? and Verbal Intelli-
gence for the Well Educated and the Less Well
Educated Supervisory Candidates


                                Tetrachoric r,
                                How Supervise?
                                and a Verbal
Educational Level               Intelligence Test    N

12th grade education or more    +.20                 57
11th grade education or less    +.65                 30


California Mental Maturity Test, and years
of education. All three of these variables
appear to be rather substantially related to
each other for the total group of supervisory
candidates tested in this study.

In order to analyze these interrelationships
further, Table 2 was prepared.

Table 2 shows the relationship between
How Supervise? and verbal intelligence with the
effect of education crudely partialed out. It
is clear from Table 2 that among the better
educated group there is comparatively little
relationship between How Supervise? scores and
intelligence. A high degree of supervisory
knowledge of the sort measured by How
Supervise? does not seem to depend particularly
on good verbal intelligence among this better
educated group.

For the less well educated group, however,
How Supervise? scores and intelligence test
scores are quite closely related. Ability to get
a high score on How Supervise? among this
group would appear to depend to an appreciable
extent on possessing good verbal intelligence.

Discussion 


The data reported here are in striking agree-
ment with those of Millard. In addition, they
fit in nicely with Maloney’s findings. It will
be recalled that Maloney reported that a
Flesch analysis put the readability level of
How Supervise? at the high school graduate
level; the present study found the best cutting
point between those subjects whose How
Supervise? scores were highly related to intel-
ligence test scores and those subjects for whom
this relationship was negligible, was also at
just about the high school graduate level.

The finding of the present study that How
Supervise? scores and verbal intelligence are
rather closely related to each other among the
subjects with less education suggests strongly
that How Supervise? scores for these people
must be interpreted with caution. On the
surface, How Supervise? might almost as well
provide a measure of general verbal intelligence
as a measure of supervisory knowledge and
understanding among this group with relatively
little education. Maloney’s findings, however,
point to the interpretation that for this low
education group the high relationship between
How Supervise? and intelligence test scores
is not because How Supervise? is an intel-
ligence test, but because, for the less well
educated persons, readability is a problem
in How Supervise?. Maloney said, “. . . to
the extent to which educational attainment is
an indication of reading ability, the test can
be read only with great difficulty by the
average factory supervisor . . . for lower level
personnel How Supervise? is of doubtful
validity as a measure of supervisory ability.”

The finding that How Supervise? scores and
verbal intelligence are only slightly related,
when the test scores are those for subjects
with relatively much education, suggests that
scores on the present forms of How Supervise?
can be used with some confidence that they
are measuring supervisory knowledge and
understanding apart from general verbal intel-
ligence (and readability) when one is dealing
with persons who went through at least the
twelfth grade.


Summary


This study summarizes past data and pre-
sents new data on the interrelationships
between How Supervise? (a widely used test of
supervisory knowledge), verbal intelligence,
and amount of education. The subjects for
this study were almost 100 candidates for shop
supervisory positions in a metal products com-
pany.

Results showed that these three variables
were more highly interrelated than early studies
had reported. Further analysis of the results
of the present study corroborated a rather 
recent study by Millard in that How Supervise? 
and verbal intelligence test scores were quite 
highly related among the less well educated 
of the total group tested. In fact, the relation- 
ship was so close that there seemed to be a real 
question with respect to whether How Super- 
vise? could measure supervisory knowledge 
apart from general intelligence for this less 
well educated group. Another recent study 
by Maloney, however, suggested that results 
like Millard’s (and inferentially those of the 
present study) could be accounted for by the 
fact that How Supervise? is difficult to read 
for those whose educational level is relatively 
low. 

The general conclusion is that How Super- 
vise?, for those persons who did not graduate 
from high school, measures intelligence (or 
readability of How Supervise?) rather than 
knowledge of the principles of supervision. 
However, for relatively well educated persons, 
How Supervise? scores have little relationship 
with intelligence test scores (or readability). 
When testing this better educated group, there 
would seem to be some assurance that How 
Supervise? can measure supervisory knowledge 
apart from verbal intelligence (or the reada- 
bility of How Supervise?). 


Received November 19, 1951. 


References 


1. File, Q. W. The measurement of supervisory qual- 
ity in industry. J. appl. Psychol., 1945, 29, 
323-337. 

2. File, Q. W., and Remmers, H. H. Studies in super- 
visory evaluation. J. appl. Psychol., 1946, 30, 
421-425. 

3. File, Q. W., and Remmers, H. H. How Supervise? 
(revised manual). New York: Psychological 
Corporation, 1948, pp. 8. 

4. Maloney, P. W. Reading ease scores for File’s How 
Supervise?. J. appl. Psychol., 1952, 36, 225-227. 

5. Millard, K. A. Is How Supervise? an intelligence 
test. J. appl. Psychol., 1952, 36, 221-224. 

6. Sartain, A. Q. Relation between scores on certain 
standard tests and supervisory success in an air- 
craft factory. J. appl. Psychol., 1946, 30, 328- 
339. 


Changes Occurring in Teacher-Pupil Attitudes during a 
Two-Weeks Guidance Workshop 


Jack Shaw, Herbert J. Klausmeier, Arno H. Luker, and Howard T. Reid 
Department of Educational Psychology, Colorado State College of Education 


Several investigations in recent years using 
the Minnesota Teacher Attitudes Inventory have
shown that teacher-pupil relationships are 
closely related to teacher-pupil attitudes, that 
teacher-pupil attitudes can be validly meas- 
ured, and that the effects of teacher education 
and teaching experience on these attitudes 
can be determined (1, 3, 4, 5, 7). The relia- 
bility of the MTAI (Form A) was reported to 
be .82 based on test-retest data one week apart 
(2). The validity of the MTAI was deter- 
mined empirically with a group of experienced 
teachers by its authors who found that it 
correlated +.60 with a criterion of the com- 
bined ratings of principals, ratings of observa- 
tions by one of the authors, and ratings by the 
pupils of the teachers. Further evidence of 
validity of the inventory was found by Shaw 
using a criterion of practice teaching super- 
visor’s ratings (9). 

The research by the authors of the MTAI 
indicated that during the pre-service education 
of teachers statistically significant changes in 
teacher-pupil attitudes did occur. Nothing 
has appeared as yet in the literature, however, 
to indicate whether or not changes in teacher 
attitudes occur during post-graduate training 
of experienced teachers. The authors of this 
article undertook to investigate this problem 
among experienced teachers, counselors, and 
administrators enrolled in a two-weeks guidance 
workshop.


Subjects 


The subjects of the study were 158 teachers,
counselors, principals, and superintendents en-
rolled in a two-weeks guidance workshop held
at Colorado State College of Education in
June, 1951. The workshop was taken for three
quarter-hours’ credit towards the B.A., M.A.,
or Ed.D. degrees. These students came from
23 states and Hawaii. There were 99 men
and 59 women in the class. One student had
freshman standing, 4 were juniors, 9 were
seniors, 110 held bachelor’s degrees, 33 held
master’s degrees, and one had earned a special-
ist’s diploma one year beyond a master’s degree.

Their experience in the educational profes- 
sion (including teaching, counseling, super- 
vision, and administration) ranged from 0 to 
35 years, with a mean of 9.8 years and a median 
of 7 years. Only 12 students had no teaching 
experience in the field. The hours of course 
work in educational psychology as reported 
by the members of the workshop ranged from 
0 to 105 quarter hours with a median of 1 
quarter hours. Their major fields of study 
were reported as follows: educational psychol- 
ogy, 23 members; educational administration, 
30; elementary education, 49; and secondary 
education, 56.

Some indication that the teacher-pupil atti-
tudes of these subjects at the time of registra-
tion in the workshop were not widely atypical
of graduate students in education is seen in a
comparison of the group with a norm group
of 115 graduate students in education described
in the manual of the MTAI (2). The work-
shop group, consisting of 158 members, had a
mean score of 53.0 on the MTAI with a
standard deviation of 25.5. The norm group
reported in the MTAI manual, consisting of
115 members, had a mean score of 48.2 with a
standard deviation of 27.5.
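
One way to put a number on "not widely atypical" is to compare the two means with a test for independent groups, using only the summary figures quoted above. The authors report no such test; the sketch below is an illustration, it treats the two groups as independent samples, and the function name is hypothetical.

```python
from math import sqrt


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic for two independent groups from summary figures."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# Workshop group versus the manual's norm group, figures from the text.
t = welch_t(53.0, 25.5, 158, 48.2, 27.5, 115)
print(round(t, 2))
```

With these figures the statistic comes out a little under 1.5, short of the values usually required for significance at the 5 per cent level with samples of this size, which fits the authors' reading that the workshop group resembled the norm group.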


The Problem 


Specifically, the problem under study was
twofold: 1. What are the differences in teacher-
pupil attitudes as measured by MTAI among
the teachers enrolled in the guidance workshop
in relation to their major fields of study?
2. What changes in scores on the MTAI oc-
curred during the two-weeks workshop (a)
among the group as a whole, and (b) among
teachers grouped according to major fields of
study?




Procedures 


The MTAI (Form A) was administered to
the teachers in the workshop at the beginning
of the two-week period, prior to any lecture,
discussion, or work on the workshop topic,
“The Improvement of Guidance Services in
the Schools of America.” The workshop mem-
bers were told that the inventory was being
administered as a good teaching procedure
in order that the faculty of the workshop might
better understand the teachers with whom they
were going to work for two weeks. Scores
(in percentile ranks based on MTAI norms pro-
vided by the inventory authors for graduate
students in education) were posted by student
numbers on the bulletin board. Without ad-
vance notice, the MTAI (Form A) was again
administered on the last day of the workshop.
The students were assured its purpose was to
evaluate the proceedings of the workshop and
in no way would the scores of the inventory
be used in assigning marks for course credit.


Results 


The results of the initial and final admin-
istrations of the MTAI are shown in Table 1.
Examination of the table shows that the four
sub-groups and the total group made statis-
tically significant increases as measured by the
difference in the means of the initial and final
inventory scores. The total group (N is 158)
showed changes in attitudes in a favorable
direction from a mean of 53.0 to 69.2, the
difference of 16.2 being statistically significant
at better than the one-tenth of one per cent
level.1
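
Because the same people took the inventory twice, the comparison of initial and retest means calls for a test on correlated data, as the authors' footnote states. The exact formula they used is not given; a minimal sketch of one standard form, the t statistic computed on paired differences, is shown below with invented scores for six people. The function name and all scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """t statistic for the mean of paired differences (after minus before)."""
    diffs = [a - b for b, a in zip(before, after)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Hypothetical initial and retest scores for six people.
initial = [40, 55, 62, 38, 71, 50]
retest = [58, 66, 70, 59, 80, 69]
print(round(paired_t(initial, retest), 2))
```

The statistic depends on the spread of the individual gains, not on the spread of the scores themselves, so a fairly uniform gain across people yields a large t even when the two score distributions overlap heavily.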

The order of greatest mean gain by major 
groups was: secondary education, 18.7; educa- 
tional administration, 18.4; educational psy- 
chology, 14.6; and elementary education, 12.5. 

The order of the groups according to means 
on the initial administration was: educational 
psychology, 65.1; elementary education, 57.6; 
educational administration, 49.3; and second- 
ary education, 46.1. The same order was 
found in the second administration of the 
inventory, although the differences between 
the means of the successive groups were 
reduced except between educational psychology 
and elementary education. 

The authors of the inventory reported that 
changes in attitude scores accompanied pro- 
fessional education courses in the pre-service 
education of students for the teaching profes- 
sion (1, 4). The results of this study indicate 
that the attitudes of experienced teachers 
returning for refresher or graduate training 
changed in a favorable direction during a 
two-weeks guidance workshop. 


Implications 


The results of this study further support 
the findings of the authors of MTAI that
professional courses in education can result in 
change in attitude scores. Whereas previous 
findings indicated that such changes seemed 
to occur primarily in the early professional 


1 Test of significance of difference between means for 
correlated data was used. 
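
The test named in this footnote can be sketched in a few lines. This is a minimal sketch only, assuming the usual paired-difference form, t = mean(d) / (s_d / sqrt(n)); the function name `paired_t` and the example scores are hypothetical and are not data from the workshop.

```python
import math

def paired_t(initial, retest):
    """t statistic for the difference between correlated (paired) means."""
    if len(initial) != len(retest) or len(initial) < 2:
        raise ValueError("need two equal-length lists with at least two pairs")
    diffs = [after - before for before, after in zip(initial, retest)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # Unbiased variance of the differences (n - 1 in the denominator).
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    if var_d == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_d / math.sqrt(var_d / n)
    return t, n - 1  # statistic and degrees of freedom

# Hypothetical initial and retest scores for five people (illustration only).
t_value, df = paired_t([50, 42, 61, 55, 47], [58, 49, 70, 57, 60])
```

The resulting t is referred to a t table with n - 1 degrees of freedom.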


Table 1 


Comparison of the Scores Made by the Workshop Members on the Minnesota Teacher Attitudes Inventory on
the Initial Test and Retest according to Major Field of Study


Major Field of Study 
ed, Admin. Elem. Ed. Sec. Ed. Total Group 
Ed. Psych. Ed: -= ‘tial z 

tial Re- Initial Re- Initial Re- Initial Re- 
Tiea Be Jnitia! at Test test Test test Test test 

N Test test Test D EA = i 
M 3 30 49 5 158 158 
st 3 23 1o oa Sto Wi 46.1 64.8 53.0 69.2 
i 65.1 79.7 23 do 177 189 282 22.0 25.5 16.0 

Piny 27.9 118 24 ra 3.43 6.13 11.17 

babii 2.66 4. 01 001 001 

MY less than 02 .001 




courses in a teacher-education program, this 
study indicated that changes occurred in post- 
graduate courses in education. 

Furthermore the results of this study support 
the findings of the authors of the inventory 
that the MTAI differentiates among teachers of 
different grade level. Whereas it was previ- 
ously shown that the order of the better 
attitudes among teachers was first early 
childhood teachers, second intermediate grade 
teachers, third senior high school teachers, and 
fourth junior high school teachers, this study 
showed a hierarchy according to field of study 
as follows: (1) educational psychology; (2) 
elementary education; (3) educational admin- 
istration; and (4) secondary education. Pos- 
sibly these orders represent some form of a 
continuum from interest in children to interest 
in academic subject matter. This continuum 
in one study was found to exist among ele- 
mentary and secondary teaching candidates 
in the sophomore year of their college educa- 
tion (6). 

Because a previous investigation (1) found a 
deterioration in attitudes as measured by 
MTAI among teachers during their first year 
of teaching experience, the question arises 
whether or not the gains in attitude scores 
made during professional courses reflect real 
changes in attitudes or merely an increase 
in facility with attitudinal terminology which 
may readily be forgotten after a few months 
away from the educational institution. The 
present study should therefore be followed up 
at the end of the subsequent school year to 
determine whether or not the gains in favorable 
attitudes of experienced teachers made in a 
two-weeks workshop are lost during the follow- 
ing year of additional experience in the field. 


Shaw, Klausmeier, Luker, and Reid 


Finally it has been demonstrated by this 
study that the MTAI may be a useful instru- 
ment in evaluating the effectiveness of pro- 
cedures or techniques such as a workshop 
in the professional education of teachers, 
counselors, and administrators. 


Received October 10, 1951. 


References 


1. Callis, R. Change in teacher-pupil attitudes related
to training and experience. Unpublished Ph.D.
thesis, University of Minnesota, 1948.

2. Cook, W. W., Leeds, C. H., and Callis, R. Manual
for the Minnesota teacher attitude inventory, Form
A. New York: The Psychological Corporation,
1950.

3. Cook, W. W., and Leeds, C. H. Measuring the teaching
personality. Educ. psychol. Measmt, 1947, 7,
399-410.

4. Cook, W. W., Leeds, C. H., and Callis, R. Predicting
teacher-pupil relations. The evaluation
of student teaching. 28th Yearbook, The Association
for Student Teaching, 1949, Ch. 4.

5. Fuller, Elizabeth M. The use of teacher-pupil attitudes,
self-rating and measures of general ability
in the preservice selection of nursery school-kindergarten-primary
teachers. J. educ. Res.,
1951, 44, 675-686.

6. Klausmeier, H., Luker, A., and Stromswold, S.
Factors influencing choice of teaching career
among college sophomores. J. educ. Res., 1951,
45, 23-32.

7. Leeds, C. H. A scale for measuring teacher-pupil
attitudes and teacher-pupil rapport. Psychol.
Monogr., 1950, 64, No. 6 (Whole No. 312).

8. Leeds, C. H., and Cook, W. W. The construction
and differential value of a scale for determining
teacher-pupil attitudes. J. exp. Educ., 1947,
16, 149-159.

9. Shaw, J. The function of the interview in determining
fitness for teacher-training. J. educ.
Res., 1952, 45, 667-681.




Personality Characteristics Related to Success in Practice Teaching 


Harrison G. Gough 
University of California, Berkeley 


and 


William H. Pemberton 


Consulting Psychologist, San Francisco, California 


The importance of personality characteristics
for tasks involving personal interaction, leadership,
and social understanding is incontestable.
The difficulty in utilizing a principle such as
this lies more in devising techniques and
methods for its adequate application than in
proving the truth of the basic assumption.
Advances in the methodology of personality
assessment and evaluation have yielded various
instruments which show promise of overcoming
this technological barrier. This study is concerned
with the application of one of these
devices, the Minnesota Multiphasic Personality
Inventory (3), to the problem of predicting
success in student practice teaching.

This investigation is a revision and elaboration
of an earlier one carried out by Pemberton (4)
as a doctoral dissertation in the School of
Education at the University of California,
Berkeley. In that project a sample of 96
men enrolled in a course in secondary school
practice teaching was studied by means of
the MMPI and the group Rorschach. Neither
scores on MMPI scales nor conventional scoring
categories on the Rorschach yielded much in
the way of significant relationships to practice
teaching ratings.1

At the same time, a clinical-intuitive sorting of
the MMPI profiles (by H. G. G.) yielded
a positive correlation with the ratings.
Results of this sorting are presented in Table 1.
The favorable outcome for this procedure



wice 
pats in practice teaching were, Ta Per- 
ree Semester on each of oe vour (2) Cons 
“Hons wi ents an h skill; 
Cia use of meee matter; (3) Teac a 
Malitagy Management. The ratings Wer 
an ive terms: outstanding, excellent, gs on vari- 
ay Tn Pemberton’s study © to i 
i A (4) were used, i ae effort 
ine Of personality character 
‘Orme investigation the qualitative rating’ the entries 
THE then Scale of 5, 4, 3, 2, and 1, AX 5) ratings- 
sue totay_S¥™mated over the eight @ % Sia for the 
Udy, “l scores thus derived served as crite 


1 St 
durin ed 
n 
Sona) ® t 


suggested that the MMPI profiles did, in fact, 
depict significant facets of personality, but that 
the relationships were obscured by a conven- 
tional scale-by-scale analysis. This inference 
is in line with an abundance of previous work 
with the MMPI? which has underscored the 
necessity of considering patterns and configura- 
tions of scores in interpreting test profiles. 
The results of the present paper, it might be 
emphasized, lend further support to this 
admonition.


Table 1


Evaluation of Clinical Sorting of MMPI Profiles of
Male Student Practice Teachers


                      Practice Teaching Ratings

Clinical          Higher      Average     Lower
Sorting           Ratings     Ratings     Ratings     Total
Higher               9          14           3          26
Average             13          20          11          44
Lower                4          10          12          26
Total               26          44          26          96


χ² = 8.51; d.f. = 4; and P > .05, < .10.
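
The chi-square reported under Table 1 can be checked from the nine cell counts. The following is a minimal sketch, assuming the ordinary Pearson chi-square for a contingency table (expected count = row total × column total / N); the function name `chi_square` is hypothetical.

```python
def chi_square(table):
    """Pearson chi-square and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Cell counts from Table 1: rows are the clinical sorting, columns the ratings.
table_1 = [[9, 14, 3], [13, 20, 11], [4, 10, 12]]
chi2, df = chi_square(table_1)
print(round(chi2, 2), df)  # 8.51 4
```

The computed value agrees with the 8.51 on 4 degrees of freedom given in the table.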


In the earlier study the test results of the 
female students were not considered. They 
were later scored, and the profiles given to 
another clinical psychologist for ratings on 
“potential success in practice teaching.”3 The
correlation of the clinical ratings with the
criterion ratings for the 58 profiles involved was
+.24, S.E. .13. This prediction was achieved 
despite the fact that no single scale on the 
MMPI showed a significant relationship to 
the criterion ratings. 

The decision was made, accordingly, to 
attempt to submit the intuitive cues used in 
test interpretation to analytical and quanti- 

2 See for example: Gough, H. G. Diagnostic patterns 
on the MMPI. J. clin. Psychol., 1946, 2, 23-37. 


3 The writers are indebted to Dr. George S. Welsh 
for furnishing these ratings. 




tative investigation to determine whether any 
objective signs or indices on the MMPI could 
be isolated which would possess useful pre- 
dictive power for the criterion of success in 
practice teaching. 

The first step was to categorize both the 
male and female samples into those with 
higher, average, and lower ratings, and then 
to calculate mean scores for each of the MMPI 
scales available. Some of the original cases 
had not taken the complete form of the MMPI, 
and as we desired to use a number of non-pathological
scales,4 these records were eliminated.
The total sample included 89 males
and 58 females. 

Table 2 presents the results of this analysis 
for the female sample.5 None of the 22 MMPI
scales considered yielded a significant F-ratio 
when evaluated in this fashion. 

The results for the male sample appear in 
Table 3. Three of the scales show some prom- 
ise in this analysis. The more successful male 
practice teachers tend to score lower on Hy 
(hysteria), lower on Pd (psychopathic deviate),
and higher on Psy (psychological aptitude).
Besides the scales listed in the two tables, two
of the indices recently proposed by Welsh (5)
were evaluated. Neither the anxiety index
(AI) nor the internalization ratio (IR) revealed
a significant relationship. 

The results with individual scales confirmed 
the findings of the earlier analysis and indicated 
that any relevant factors on the MMPI would 
have to be identified by a more configural and 
intuitive approach. The steps involved in 
translating this deduction into a research 
design were: first, a great deal of free associa- 
tion and ratiocination about the signs and 
clues involved in the intuitive sortings; and
second, an attempt to write simple indices and
signs summarizing them. Altogether some
15 “signs” were devised and tested against 

4 The additional scales used were: Re (social responsibility),
Do (dominance), St (status), Sp (social participation),
Ac (academic achievement, high school), Ds (dissimulation),
Psy (psychological aptitude), Ie (intellectual efficiency), and
Si (social introversion). These scales have been described in a
series of publications too long to be listed here.
As examples the reader is referred to 1 and 2.

5 To reduce printing costs Tables 2 and 3 have been 
deposited with the American Documentation Institute. 
Order Document 3661 from American Documentation 
Institute, 1719 N Street N.W., Washington 6, D. C 
remitting $1.00 for microfilm (images 1 inch high on 
standard 35-mm. motion picture film) or $1.00 for 
photocopies (6 X 8 inches) readable without optical aid. 




Table 4


MMPI “Signs” Predictive of Success in
Practice Teaching*


Sign                            χ²     d.f.   P
1. Pa > 50, < 56               7.33     2     > .02, < .05
2. Ma > 48, < 60               6.74     2     > .02, < .05
3. Pa + L < 56                 3.27     1     > .05, < .10
4. Ma > 48, < 60
   and
   Pd > 46, < 58               2.88     1     > .05, < .10
5. Hy > Pd + 1†                1.69     1     > .10, < .20
6. K + Ie + 3 Do >
   Pt + Sc + Pa + 15†          1.22     1     > .20, < .30
7. St + Re + Ie > 102†         2.84     1     > .05, < .10
8. Ma + K + D >
   Pd + Hs + Pt                1.75     1     > .10, < .20


* All scales are expressed in T-scores, except L, Ie
(intellectual efficiency), St (status), Do (dominance),
and Re (responsibility).

† These constants were selected so as to divide the
sample of 147 as evenly as possible into those with sign
present and those with sign absent.
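
The eight signs of Table 4 lend themselves to mechanical scoring. The sketch below is illustrative only: the function name `sign_score`, the dictionary layout, and the example profile are hypothetical, and it assumes that "3 Do" means three times the Do score, that the third scale in sign 7 is Ie, and that every inequality is strict.

```python
def sign_score(s):
    """Count how many of the eight Table 4 signs are present in one profile.

    `s` maps scale abbreviations to scores in the units given in the table
    footnote (T-scores, except L, Ie, St, Do, and Re).
    """
    signs = [
        50 < s["Pa"] < 56,                                        # sign 1
        48 < s["Ma"] < 60,                                        # sign 2
        s["Pa"] + s["L"] < 56,                                    # sign 3
        48 < s["Ma"] < 60 and 46 < s["Pd"] < 58,                  # sign 4
        s["Hy"] > s["Pd"] + 1,                                    # sign 5
        # sign 6: "3 Do" is read here as 3 x Do (an assumption)
        s["K"] + s["Ie"] + 3 * s["Do"] > s["Pt"] + s["Sc"] + s["Pa"] + 15,
        # sign 7: the third scale is read here as Ie (an assumption)
        s["St"] + s["Re"] + s["Ie"] > 102,
        s["Ma"] + s["K"] + s["D"] > s["Pd"] + s["Hs"] + s["Pt"],  # sign 8
    ]
    return sum(signs)  # each True counts as 1, so the total runs from 0 to 8

# Hypothetical profile, for illustration only (not a case from the study).
profile = {"Pa": 53, "Ma": 55, "L": 2, "Pd": 50, "Hy": 56, "K": 55, "Ie": 30,
           "Do": 18, "Pt": 48, "Sc": 50, "St": 22, "Re": 24, "D": 50, "Hs": 47}
print(sign_score(profile))  # 6
```

A profile lacking any of the required scales raises a `KeyError`, which mirrors the situation in which a record from the short form of the inventory cannot be scored on every sign.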


the threefold criterion breakdown.6 Eight of
these signs revealed some discriminatory potential.
These signs are listed in Table 4.

After these eight signs had been established 
each of the original records was scored for 
presence or absence of each sign, with a total
“sign-score” also being obtained. With eight
signs, this score could range from zero to eight.
As would be expected, these “sign-scores”
were significantly related to success in practice
teaching. The results are given in Table 5.

A separate analysis was made for each sex.
For females the x2 was 1:8, Gh. 4, PA AG 
< .20. For males the x? was 9.3, d.f. 4, ane 
P> 05, <.10. For the combined sa™P’ 
only 6 of the 147 cases are misclassified by
two steps, and 71 cases are exactly classified
on the diagonal. The over-all predictive efficiency
would thus appear to be good for this
sample.

A cross validation of these signs was also
attempted. MMPI profiles of students enrolled
in courses in primary grades practice
teaching were obtained.7 Eighty cases (40 of
each sex) with quite high ratings were selected
and a second sample of 80 (40 of each sex) with
rather low ratings was also chosen. Some
students had not taken the full form of the

6 The two subsamples were pooled for this analysis,
providing a total of 147 cases.

7 Dr. Fred Tyler very kindly made these records
available.



MMPI, so we were unable to compute the 
three signs involving K, Ie, Do, St, and Re. 
The results of this analysis are presented in 
Table 6. 

The χ² here is significant between the .02
and .05 levels. From Table 6 it appears that
the signs are more effective in identifying better
practice teachers than in specifying poorer ones.
In fact, there is actually a slight tendency for
the sign-score to work “backwards” at the
lower end. We would not anticipate this
result if all eight signs were used, but at the
moment there is no evidence for this disclaimer.

Judging from the cross-validational evidence
the method of profile interpretation advocated
here does possess validity for predicting success
in practice teaching. Its practical efficiency
may not be high, but the empirical confirmation
of the methodology indicates that a systematic
and diligent search for signs and patterns
might well yield predictors of practical utility.
me: Certainly the results highlig eee 
ae °F involved in concluding that the MMP 

es not “Work” because single scale analysis 


ti 


The Predictive efficiency observed in this 
P Table 5 M 
tedictive Efficiency of Eight MMPI “Signs 
in the Original Sample 


Practice Teaching Ratings 


Higher Average Lower 
š Ratings Ratings Ratings 
2 a 1 29 
Š : io r 87 
Zs [o9 19 46 22 - 
= 5 13 13 
147 
36 75 36 
Wia 
T49; d.f. = 4; and P < 01. 
pS Pak uld be to 


n A . AA 
pay Micchnique for searching for signs ie e Meehl, 
1950" 


owl's Configural method for Hons te Psychol., 
, 


tigation” 165-171) to pairs of scales. A cu 


i O; zi i 
png resul With other diagnoses has aches Sullivan, 
: S utilizing this modification. e for objective 
J. consult. 


aticg Study ‘similar to the present one Y 


a 
a og po a large scale (N > 500 in bol 


* st 

nately ever, the safidaiton of the sign shot ouid 

= be lep POn demonstrable relationships, an jecture- 
cft in the realm of speculation and con) 


Table 6


Cross-validational Predictive Efficiency of
Five MMPI “Signs”


                  Practice Teaching Ratings

                  Higher      Lower
Sign-Score        Ratings     Ratings     Total
4-5                 27          15          42
2-3                 39          54          93
0-1                 14          11          25
Total               80          80         160


χ² = 6.21; d.f. = 2; and P > .02, < .05.


study is also, of course, attenuated by the 
unreliability of the criterion. It has not been 
possible to evaluate the ratings systematically, 
but the impression is gained that ratings vary 
considerably in accuracy and value. A further 
study such as the present one would want to 
insure the obtaining of a valid and reliable 
set of criterion measures if a goal of practical 
forecasting efficiency is to be met. There is 
also the possibility that better results could be 
achieved by differentiating the criterion—per- 
formance in practice teaching—into its con- 
stituent phases, instead of treating it in an 
over-all way. 
Summary 


An attempt was made to predict success in 
practice teaching from personality test meas- 
ures. Single scales on the MMPI showed 
little validity but various patterns and indices 
revealed considerable promise. Certain meth- 
odological implications of this study for projects 
devoted to goals of practical assessment and 
evaluation were discussed. 


Received September 28, 1951.


References 


1. Gough, H. G. A new dimension of status: I. De- 
velopment of a personality scale. Amer. sociol. 
Rev., 1948, 13, 401-409. 

2. Gough, H. G., McClosky, H., and Meehl, P. E. 
A personality scale for dominance. J. abnorm. 
soc. Psychol., 1951, 46, 360-366, 

3. Hathaway, S. R., and McKinley, J. C. The Minne- 
sota multiphasic personality inventory. Minne- 
apolis: University of Minnesota Press, 1943. 

4. Pemberton, W. H. Test characteristics of student 
teachers rated at the extremes of teaching ability. 
Doctoral dissertation, Univ. of Calif., Berkeley, 
1950. 

5. Welsh, G. S. An anxiety index and an internaliza- 
tion ratio for the MMPI. J. consult. Psychol., 
1952, 16, 65-72. 


Predictive Value of The Empathy Test in Automobile Salesmanship 


Francis P. Tobolski and Willard A. Kerr 
Illinois Institute of Technology 


Examination of the literature on job success 
in various fields of salesmanship reveals that 
empathic ability measurement has not been 
investigated as a possible useful predictor (1, 3, 
6). In fact, a standardized and validated 
objective test of empathic ability was not 
available until 1951 (4, 5). The absence of 
empathic ability measurements in salesman- 
ship prediction batteries seems a remarkable 
historical oversight by psychologists. On the 
surface at least, it appears fairly certain that 
the success of the salesman usually depends 
significantly on his ability to assimilate and 
anticipate the feelings of others. This study 


tests that hypothesis for the job of selling auto- 
mobiles. 


Subjects and Procedure 


Subjects for this study were sales personnel 
of two of Chicago’s largest automobile agencies. 
Twenty-two (seven new car and fifteen used 
car salesmen) of the Company A sales personnel 
of twenty-five and all ten (four new and six 
used car salesmen) of the Company B salesmen 
participated in the study. 

The instrument used for measuring empathic 
ability was an approximately fifteen-minute, 
completely objective test, The Empathy Test.

Its rationale, explained elsewhere (4, 5),
hypothesizes that the presence of empathic
ability may be demonstrated by the ability to
predict representative behaviors of normative
individuals. The Empathy Test applies this
rationale by requesting the respondent to pre- 
dict the responses of typical individuals in key 
behavior areas—aesthetics (music types), gen- 
eral human interests (what people read), and 
interpersonal relations (annoying experiences). 

Criteria for job success were sales ratio (cars 
sold divided by attempts) over a three-month 
to twelve-month period and sales managers’ 
rank order ratings. Four validity coefficients 
were computed on each of the two criteria 
since each company included salesmen special- 
izing separately in used car or new car selling. 


Results 


Inspection of Table 1 shows a mean (Fisher
z-transformed) rank difference coefficient of
validity of .44 against the sales record criterion.
The mean validity coefficient against the sales
manager rankings is .71. Both of these coefficients
are significant at the .98 probability
level, or better, according to the Thornton (7)
tables. Approximately one-fifth of the variance
in objective sales records is accounted
for by variance in scores on The Empathy Test.
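
Averaging coefficients through Fisher's z, as mentioned above, can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `mean_correlation` is hypothetical, the optional weighting by n - 3 is a common convention rather than something the text specifies, and the example coefficients are illustrative, not the Table 1 entries.

```python
import math

def mean_correlation(rs, ns=None):
    """Average correlation coefficients through Fisher's z = atanh(r).

    If sample sizes `ns` are supplied, each z is weighted by n - 3 (an
    assumed convention); otherwise the z values are averaged unweighted.
    """
    weights = [1.0] * len(rs) if ns is None else [n - 3 for n in ns]
    if any(w <= 0 for w in weights):
        raise ValueError("every group needs more than three cases")
    z_bar = sum(w * math.atanh(r) for w, r in zip(weights, rs)) / sum(weights)
    return math.tanh(z_bar)  # transform the mean z back to the r scale

# Hypothetical coefficients, for illustration only.
print(round(mean_correlation([0.0, 0.8]), 3))  # 0.5
# Share of criterion variance associated with the reported r of .44.
print(round(0.44 ** 2, 2))  # 0.19
```

The second line of output corresponds to the "approximately one-fifth" of variance cited in the text.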

An interesting result is the consistent tendency
for validation coefficients on the salesmen
of new cars to exceed those of the salesmen of
used cars. It is difficult to account for this
difference except, perhaps, as an artifact of the
differential effects of television advertising of
new and used cars on the actual final selling
situation. Both companies employ television
as their principal medium of advertising. The
television viewer who is interested in the new


Table 1 


A Summary of Validity Coefficients from The Empathy
Test on Groups of Salesmen Employed by Two
Leading Chicago Auto Dealers


Rank Difference 


Salesmen Sales by ee 
Group N Record Manag 
Company A 50 
New car 7 42 41 3 
Used car 15 .05 33 
Total 22 25 7 
Company B 
New car 4 AS. i” 
Used car 6 18 G 
Total 10 58 ‘92 
New car total 11 .61 17 
Used car total 21 .12 s 1 
Total both companies 32 .44 7 


* Coefficients which are reliable at the .95 probability
level are indicated in boldface.



cars finds them as typically represented upon
visiting salesrooms, but the television viewer
interested in used cars finds that the typical
vehicles in the used car lots may differ considerably
from their glossy appearance on television.
Theoretically, the potential used car customer
would therefore experience greater frustration
and disillusionment than does the potential new
car customer when he examines the actual
vehicles. Aggression from this greater frustration
(2) of the potential used car customer
seems most likely to be channeled against the
nearest plausible object, which is, of course,
the used car salesman. Thus the used car salesman
may find that this customer aggression
factor prevents as consistent a sales performance
as is obtained by a new car salesman of
equal empathic ability. It is possible that this
frustration factor results in the potential customer
largely discounting sales contentions and
buying largely on the basis of other evidence
and observations.


Summary

Thirty-two salesmen of two of Chicago's
largest new-and-used automobile agencies were
examined on The Empathy Test and their scores
were correlated with two criteria.

1. Empathy Test scores were significant predictors
of sales records (r = .44).

2. Empathy Test scores were significant predictors
of the job success of sales crew members
as ranked by their sales managers (r = .71).

3. Empathy Test score predictions of sales 
and success ranking of used car salesmen were 
very inefficient (r’s = .12 and .17). 

4. The superior predictive value of the test 
on new car salesmen is tentatively attributed 
to the differential effects of television advertis- 
ing on the selling operation for new and used 
cars. A frustration-aggression situation is hy- 
pothesized to arise more frequently in the latter. 


Received October 24, 1951. 


References 


1. Buros, O. K. Third mental measurements yearbook.
New Brunswick, New Jersey: Rutgers University
Press, 1949.

2. Dollard, J. L., Doob, N. J., Miller, N. E., Mowrer,
O. H., and Sears, R. R. Frustration and aggression.
New Haven: Yale University Press,
1939.

3. Dorcus, R. M., and Jones, M. H. Handbook of
employee selection. New York: McGraw-Hill
Book Co., Inc., 1950.

4. Kerr, W. A., and Speroff, B. J. The empathy test.
Chicago 90: Psychometric Affiliates, 1951.

5. Speroff, B. J., and Kerr, W. A. Validation and
evaluation of the empathy test. Presented at
annual meeting, Midwestern Psychological Association,
Drake Hotel, Chicago, April 27, 1951.

6. Super, D. E. Appraising vocational fitness. New
York: Harper and Bros., 1949.

7. Thornton, G. R. The significance of rank difference
coefficients of correlation. Psychometrika,
1943, 8, 211-222.



Predicting Success of Students in Veterinary Medicine* 


Wilbur L. Layton 


Student Counseling Bureau, University of Minnesota 


The enrollment in schools of veterinary 
medicine has increased by approximately 500 
per cent in the last twenty-five years (2). The 
competition for admission to these schools has 
been almost as great as that for admission to 
medical schools. Counselors, teachers and administrators
therefore must have available
indices predictive of the success or failure of 
students in veterinary medicine. These indices 
are potentially useful for the selection of stu- 
dents and for the counseling of students, before 
and after their admission to schools. 

Recently, a Veterinarian Aptitude Test was 
devised by Owens and Payne of the Iowa 
State College (4, 5) and Hannum developed 
a veterinary medicine scale for the Strong 
Vocational Interest Blank (1). The present 
study was designed to evaluate these and other 
tests and predictive indices forecasting the 
academic achievement of students enrolled in 
the School of Veterinary Medicine at the Uni- 
versity of Minnesota. 


Method 


In the fall quarters of 1948 and 1949, fresh- 
men enrolled in the School of Veterinary 
Medicine at the University of Minnesota were 
tested on the Strong Vocational Interest Blank, 
the Iowa State College Veterinary Aptitude 
Test, and two parts of the Professional Apti- 
tude Test of the Educational Testing Service. 
The Iowa Test consists of three parts: reading
comprehension, verbal memory, and pre-veterinary
achievement in chemistry and
zoology. The two parts of the Professional
Aptitude Test used were the following: the
verbal ability section, yielding verbal ability,
scientific; verbal ability, special; verbal ability,


* The statistical analysis for this study was made
by the Bureau of Institutional Research, University of
Minnesota. Appreciation is expressed to Dr. Robert
Keller, Director, and Margaret Abernathy, Research
Fellow, Bureau of Institutional Research, and to Professor
Martin H. Roepke of the University of Minnesota
School of Veterinary Medicine for their assistance and
cooperation in this study.


humanistic and composite verbal ability scores, 
and the pre-medical science achievement sec- 
tion. 

Grades in chemistry and other physical 
science courses taken as a part of the pre- 
veterinary work were available and combined 
as one predictive variable. Pre-veterinary 
grades in botany and zoology were combined 
in a similar manner. Grades in all pre- 
veterinary medicine courses were combined as 
a third measure of pre-veterinary achievement. 
The ages of the students were also available. 

A total of 100 students were admitted as 
freshmen in the two classes studied. Complete 
data were available for 87 males. When grades 
in the School of Veterinary Medicine became 
available, data for the two classes were com- 
bined and the relationships determined between 
the predictive measures and cumulative honor- 
point-ratios at the end of the first year in the 
School of Veterinary Medicine. The Strong 
Vocational Interest data were analyzed first, 
then selected scales on the Strong Blank and 
the other predictive indices were correlated
with the criterion, first year grades, and with
each other. Several combinations of these
variables were tried out to determine their
predictive efficiency. One combination was
selected as giving optimal prediction. 


Analysis of the Strong Vocational 
Interest Blank 


Distributions of the letter ratings on 39 
occupational scales and three non-occupational 
scales for the Strong Vocational Interest Blank 
were dichotomized and related to the criterion, 
first year grades. Chi-square tests were used
to determine statistically significant relationships.
The following six scales were found to
be related to the criterion with Chi-square
significant at the one per cent level: osteopath,
chemist, farmer, aviator, Y.M.C.A. secretary,
and sales manager.

The veterinarian scale (1) became available 
when this study was almost completed. 




Table 1 


Coefficients of Correlation Between Predictive Indices and Freshman HPR for Classes (N = 87) 
Entering the School of Veterinary Medicine in 1948 and 1949 


                                          Coefficient of              Standard
Predictor                                 Correlation      Mean       Deviation

Iowa State Veterinary Aptitude Test
 1. Paragraph Comprehension                   .21          49.28        5.57
 2. Verbal Memory                             .03          23.40        4.01
 3. Pre-veterinary Achievement                .27          57.53        6.99
 4. Total (1+2+3, unweighted)                 .25         130.21       12.16

Professional Aptitude Test
 5. Verbal Ability, Scientific                .22          29.95        9.83
 6. Verbal Ability, Special                   .00          25.61        9.33
 7. Verbal Ability, Humanistic                .03          16.63        8.17
 8. Verbal Ability, Total                     .10          72.20       23.52
 9. Pre-medical Science                       .30          46.28       17.19

Strong Vocational Interest Blank
10. Osteopath                                 .28          43.52        8.95
11. Veterinarian                              .30          47.22        8.17
12. Chemist                                   .22          34.33        8.73
13. Farmer                                    .12          50.38        6.84
14. Aviator                                   .28          40.80        8.04
15. Y.M.C.A. Secretary                        .12          23.79       10.29

i 16. Sales Manager f “A 23.94 6.74 

Pre-Veterinary Grades* 

17. Chemistry HPR F by : a 

1, Zoology and Botany HPR ‘33 (Bh a 

19. Total Pre-Vet. HPR _ 33 38 aie 

20, Age at Entrance to Vet. Med. G T E 

21. Freshman HPR E : - 
* Expressed as an honor-point-ratio (honor points divided by number of credits). Honor points were assigned
as follows: A = 3, B = 2, C = 1, D, F = 0.

Strong answer sheets were rescored for it and it
was analyzed separately by the above method.
A significant relationship (one per cent level)
was found between it and the criterion.

The first six scales listed above were included
in the entire correlation study. The veterinarian
scale was included in the final stages of
the correlation study.


Table 2

Intercorrelations of the Variables Used in Trial Multiple Correlations
P.A.T. Ia. 
Pre-Med. Pre-Vet. Pre-Vet. Strong 
y, Strong Strong Age Sci. meh HPR Vet. 
table Osteo. Aviator 20 9 19 11 
2 2 es a 30 27 33 30 
10 .28 .28 "20 15 17 28 27 
14 31 p 2 —.03 18 .06 -08 
20 : —,.05 —.19 = 2 —.12 
9 65 «23 ld 
3 2t =01 
30 




The Correlation Study 


The raw scores or standard scores for each 
of the variables were converted into stanine 
scores. These stanines were punched into 
IBM cards and a table of intercorrelations 
(Pearsonian) was computed. Table 1 presents 
the validity coefficients, mean raw or standard 
scores and standard deviations for all variables. 
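
The conversion to stanines described above can be illustrated with a short sketch. The article's own conversions are those of its Table 4; what follows instead assumes the conventional stanine definition, in which the nine bands take 4, 7, 12, 17, 20, 17, 12, 7, and 4 per cent of the cases, and assigns each score by its mid-percentile rank. The function name `to_stanines` is hypothetical.

```python
from bisect import bisect_left, bisect_right

# Cumulative percentage cut-offs for the conventional stanine bands.
STANINE_CUTS = [4, 11, 23, 40, 60, 77, 89, 96]

def to_stanines(scores):
    """Assign each score a stanine (1-9) from its mid-percentile rank."""
    n = len(scores)
    ordered = sorted(scores)
    result = []
    for x in scores:
        below = bisect_left(ordered, x)            # cases scoring lower
        equal = bisect_right(ordered, x) - below   # cases tied with x
        pct = 100.0 * (below + 0.5 * equal) / n    # mid-percentile rank
        result.append(bisect_right(STANINE_CUTS, pct) + 1)
    return result

# One hundred distinct scores fall into the bands in the expected proportions.
stanines = to_stanines(list(range(1, 101)))
print([stanines.count(k) for k in range(1, 10)])
# [4, 7, 12, 17, 20, 17, 12, 7, 4]
```

With small or heavily tied samples the realized band sizes will depart from these percentages, which may be why a fixed conversion table is more convenient in practice.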

Selected variables were tried in combina- 
tion to determine their predictive efficiency. 
Table 2 indicates the intercorrelations of the 
variables used in the various combinations. 
Multiple correlation coefficients were computed 
for these several combinations. These cor- 
relations, taken in pairs, were compared using 
the F test. Table 3 presents the various 
combinations tried, the resulting multiple 
correlation coefficients, and results of the F 
tests. The last combination of variables listed, 
with the fewest number of variables, yielded 
a coefficient of multiple correlation which 
compared well with the larger ones and hence 
appeared best for practical use. 
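
For readers who wish to see how a multiple correlation follows from simple correlations, the two-predictor case has a closed form, R² = (r₁² + r₂² − 2 r₁ r₂ r₁₂) / (1 − r₁₂²). The sketch below uses that textbook formula; it is not the computing routine used in the study, the function name `multiple_r` is hypothetical, and the intercorrelation in the example is an arbitrary illustrative value rather than an entry from Table 2.

```python
import math

def multiple_r(r_y1, r_y2, r_12):
    """Multiple correlation of a criterion with two predictors.

    r_y1, r_y2: correlations of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    if not -1.0 < r_12 < 1.0:
        raise ValueError("predictor intercorrelation must lie strictly between -1 and 1")
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(max(r_squared, 0.0))

# Validity coefficients of .53 and .30 with a hypothetical intercorrelation of .20.
r_two_predictors = multiple_r(0.53, 0.30, 0.20)
```

With more than two predictors the same quantity is obtained from the full matrix of intercorrelations, which is why a table such as Table 2 is needed.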

The three variables combined in the final 
regression equation to predict first year grades 
were: the veterinarian scale on the Strong, 
pre-veterinary total honor-point-ratio, and pre- 
veterinary achievement score on the Iowa 
Veterinary Aptitude test. 

The regression equation is: 


y = .86 + .43 X₁ + .23 X₂ + .17 X₃


Table 3 


Combinations of Variables with Resulting Coefficients 
of Multiple Correlation and Levels of Signifi- 
cance of the F Tests of the Difference Be- 
tween Pairs of Multiple Correlations 


Level of 
Coefficient Signifi- 
Variables in of Multiple cance of 
Combination Correlation the F Test 
10, 14, 20, 9, 19, 11 6537 
10, 14, 20, 9, 19, 3, 11 ‘6678 <.05 
10, 14, 20, 3, 19, 11 .6506 05 
10, 14, 20, 9, 3, 19, 11 6678 <.05 
10, 14, 20, 9, 3, 19, 11 6506 
10, 3, 19, 11 6134 à 
10, 14, 20, 3, 19, 11 .6506 F 
3,19, 11 6017 05 


Wilbur L. Layton 


Table 4 


Conversion of Scores to Stanines 


                          Strong        Iowa
Freshman     Total        Vet. Scale    Pre-Vet.
Year         Pre-Vet.     Standard      Achievement
H.P.R.*      H.P.R.*      Scores        R.S.           Stanine
3.0          2.5-2.8      61-72         68-70             9
2.5-2.9      2.3-2.4      57-59         66-67             8
2.1-2.4      2.1-2.2      52-56         63-65             7
1.8-2.0      1.9-2.0      50-51         60-62             6
1.5-1.7      1.7-1.8      44-49         57-59             5
1.3-1.4      —            42-43         52-56             4
1.1-1.2      1.6          36-41         47-51             3
1.0          1.5          32-35         44-46             2
AD 1.1-1.3 28-31 40-42 1 


* Expressed as an honor-point-ratio (honor-points 
divided by number of credits). Honor points were 
assigned as follows: A = 3, B = 2, C = 1, D,F = 0. 


where

$y$ = Predicted honor-point-ratio, first year in Vet. Med.

$X_1$ = Total honor-point-ratio in pre-veterinary courses.

$X_2$ = Veterinarian scale on the Strong.

$X_3$ = Pre-veterinary achievement test score of the Iowa Test.

R = .60.

$y$, $X_1$, $X_2$, and $X_3$ are expressed in stanines. Table 4 presents the
conversions of raw or standard scores into stanines for these four variables.
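
As an illustration only, the regression equation can be applied in a few lines of Python. This is a minimal sketch: the intercept and weights are read as .86, .43, .23, and .17, and the function and argument names are hypothetical. All inputs and the output are stanines, as in Table 4.

```python
def predict_first_year_stanine(hpr_stanine, strong_stanine, iowa_stanine):
    """Predicted first-year honor-point-ratio, expressed as a stanine.

    hpr_stanine    : stanine of total pre-veterinary honor-point-ratio (X1)
    strong_stanine : stanine on the Strong veterinarian scale (X2)
    iowa_stanine   : stanine on the Iowa pre-veterinary achievement test (X3)
    """
    return 0.86 + 0.43 * hpr_stanine + 0.23 * strong_stanine + 0.17 * iowa_stanine


if __name__ == "__main__":
    # A student at the middle stanine on all three predictors.
    print(round(predict_first_year_stanine(5, 5, 5), 2))
```

With these coefficients, a student at stanine 5 on every predictor is predicted to fall very close to stanine 5 on the criterion, which is what one would expect of a regression equation fitted to standardized scores.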


Discussion 


The correlation of .53 between total pre- 
veterinary H.P.R. and freshman grades is 
somewhat higher than that reported by Owens 
(3). However, the correlations found for 
chemistry H.P.R. and zoology and botany
H.P.R. are almost identical with those reported 
by him. Owens reported relatively high valid- 
ity coefficients (r’s ranging from .48 to -7 
for the Veterinary Aptitude Test. In the
present study neither aptitude test correlated 
highly with the criterion. In fact the selected 
interest scales correlated about as highly. 

This is one of the few instances known to the author in which a scale on
the Strong has made a significant contribution to a regression equation
predicting students' grades. The School of Veterinary Medicine at the
University of Minnesota is a newly established one. The two classes studied
were the second and third classes admitted to the school. The faculty has
given a great deal of personal attention to these students. One might
hypothesize that this situation is responsible for the relationship between
interests and grades.


Summary 


Twenty variables were studied to determine their effectiveness in predicting
grades earned by first year students in veterinary medicine. Total
honor-point-ratio in pre-veterinary course work, the veterinarian scale on
the Strong Vocational Interest Blank and the pre-veterinary achievement test
of the Iowa State College Veterinary Aptitude test were combined in the final
regression equation and yielded a coefficient of multiple correlation of .60.


Received July 7, 1952. 
Early publication. 


References 


1. Hannum, T. E. Response of veterinarians to the 
Strong Vocational Interest Blank for Men, 
Proc. Ia. Acad. Sci., 1950, 57, 381-384. 

2. Klussendorf, R. C. Education in veterinary medi- 
cine. Higher Educ., 1949, 5, 181-185. 

3. Owens, W. A. Development of a test of aptitude 
for veterinary medicine. Proc. Ia. Acad. Sci., 
1950, 57, 417-423. 

4. Owens, W. A., and Payne, L. C. The Iowa State
College Veterinary Aptitude Test. Ames, Iowa,
1950.

5. Owens, W. A. An aptitude test for veterinary medi-
cine. J. appl. Psychol., 1950, 34, 295-299.




The Criterion Problem in the Prediction of Medical School Success* 


Eugene L. Gaier 


University of Illinois 


The current study was initiated for the pur- 
pose of investigating the criterion used in the 
professional medical training program. The 
specific purpose of the present paper is to review 
the procedures used to evaluate performance 
in one medical school and to determine the 
adequacy of these procedures in terms of the 
objectives of medical education both from a 
logical and a statistical point of view. 

A basic definition here is that of a criterion 
as a set of symbols used to describe the per- 
formance of individuals on a success continuum 
(4). The measure of success emphasized in 
medical training has been the average of four 
years’ marks on the assumption that the marks 
or grades assigned represent the extent to 
which the objectives of the medical school 
have been achieved. The characteristics of 
such a criterion might be grouped for purposes 
of discussion as: (1) relevance or pertinence; 
(2) comprehensiveness; (3) reliability and dis- 
crimination; and (4) freedom from bias. The 
first two of these characteristics involve judg- 
ment as to the adequacy with which the pro- 
posed success measures represent performances 
on the defined dimension of success. The 
third and fourth categories deal with the more 
formal aspects common to any set of psycho- 
logical measuring techniques. These aspects 
are evaluated in terms of the assigned marks 
as indicators of the level of performance. 


Procedure 


The records of the students of two classes 
who entered the Medical College of the State 
University of Iowa were used as the data for 
this investigation. They were selected as being 
two representative classes of this school, and 
will henceforth be referred to as Class I and 
Class II. 

Class I. School records for ninety individuals, 5 women and 85 men, were
available. The ages ranged from 20 to 34. Thirty-five members (38.8 per
cent) of this class completed all of their premedical training at the State
University of Iowa, while 26 (28.8 per cent) of the students had done part
of their premedical training at this institution. Sixty-six cases (73.3 per
cent) completed the four year medical program and received the M.D. degree.

* This study is a condensation of a Master's thesis submitted at the State
University of Iowa. The author wishes to express appreciation to Dr. Harold
P. Bechtoldt under whose direction this study evolved.

Class II. Records for this class were avail- 
able on 4 women and 83 men, for a total of 87 
cases. Nineteen members (21.9 per cent) of 
this group completed some of their premedical 
education at the State University of Iowa 
while 38 (43.6 per cent) of the members com- 
pleted all of their premedical training at this 
institution. Seventy-four (84 per cent) of the
students completed the four-year curriculum
and received the M.D. degree.

For each student, data on the following items were punched on IBM cards:
age, sex, premedical school, premedical grade point average,¹ national
percentile rank on the Moss Aptitude Test,² year and reason for leaving
medical school, the letter grade received for each of the 42 courses in the
four-year medical curriculum, and the grade point average for each year of
medical work.

The undergraduate course record as well as the four yearly weighted grade
point averages were available for each student. Credits of four, three, two,
one, and zero per unit hour were assigned to grades A, B, C, D, and F,
respectively, in determining a total grade point for each student. Each
grade was assigned an additional weight in terms of the number of clock
hours in the course.
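
The weighting scheme just described can be sketched as follows. This is an illustration under stated assumptions rather than the study's actual tabulation procedure: the failing grade is taken to be F, and the function name and the sample clock hours are hypothetical.

```python
# Credit per letter grade, as described in the text (F assumed for the failing mark).
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}


def weighted_grade_point_average(courses):
    """Clock-hour weighted grade point average.

    courses: iterable of (letter_grade, clock_hours) pairs.
    """
    total_points = 0.0
    total_hours = 0.0
    for letter, hours in courses:
        total_points += GRADE_POINTS[letter] * hours
        total_hours += hours
    if total_hours == 0:
        raise ValueError("at least one course with non-zero clock hours is required")
    return total_points / total_hours


if __name__ == "__main__":
    # Hypothetical year: a long course with a B and a short course with a D.
    print(round(weighted_grade_point_average([("B", 200), ("D", 100)]), 2))
```

Weighting by clock hours means that a grade in a long laboratory course moves the yearly average more than the same grade in a short lecture course.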


Results 


The analysis of the data on the criterion measures will be presented under
the following four headings: (1) frequency distributions of grades; (2)
intercorrelations of eleven selected variables; (3) reliability of the
assigned grades; and (4) prediction of medical school success.

¹ This coded premedical grade point average will henceforth be called the
premedical average grade.
² The coded national percentile on the Moss Aptitude Test will henceforth be
called the Moss Aptitude score.

1. Frequency Distribution of Grades. Frequency and percentage distributions
of course grades for the two classes are presented in Table 1. For both
classes, the distributions of grades given in the last three years of work
are very similar (χ² = 6.61, P > .10 for the combined distributions, with
four degrees of freedom). The distributions of grades given during the
freshman year, however, show differences significant at the one per cent
level (χ² = [...].14, P < .01 with four degrees of freedom).

[Table 1: frequency and percentage distributions of course grades for the
two classes; the body of the table is not recoverable from the scan.]

While Class II had fewer students eliminated, the difference in the
proportions securing degrees in the two classes (73.3 per cent and [...] per
cent, respectively) is not significant at the five per cent level of
confidence. The similarity of the grade distributions for the last three
years together with the insignificant difference in the number of failures
in the two classes during those years suggest either that performances of
the two classes over the four years are comparable or that the standards of
grading may shift from year to year to accommodate differences in general
performance level. Although these alternative explanations cannot be
directly evaluated without comparable, objective achievement tests, some
information can be secured from the differences in the mean scores on the
predictor variables and the average grades assigned during the first year.

The differences between the means of the two groups³ were significant at the
one per cent level (t = 2.58, degrees of freedom > 100) for the four
variables: anatomy, histology, the premedical average grade, and the Moss
test scores. The differences in the means on the Moss test and the
premedical average can be considered as representing a difference in initial
performance on the factors represented by these variables. If the scores on
the Moss and the premedical grade averages represent comparable standards of
evaluation for the two classes, Class II can be considered the more
promising of the two groups.

³ The assumption of homogeneity of variances for the tests of significance
of differences was established for all variables over the two classes at
better than the five per cent level of confidence.
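
The comparison of the proportions securing degrees can be checked with a short sketch. The paper does not say which test was applied; the pooled two-proportion normal approximation below is an assumption, and the function name is hypothetical. The counts (66 of 90 for Class I, 74 of 87 for Class II) are those given under Procedure.

```python
import math


def two_proportion_z(success_1, n_1, success_2, n_2):
    """z statistic for the difference between two independent proportions,
    using the pooled estimate of the common proportion."""
    p_1 = success_1 / n_1
    p_2 = success_2 / n_2
    pooled = (success_1 + success_2) / (n_1 + n_2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    return (p_2 - p_1) / standard_error


if __name__ == "__main__":
    z = two_proportion_z(66, 90, 74, 87)
    # Compare |z| with 1.96, the two-sided five per cent point of the normal curve.
    print(round(z, 2), abs(z) < 1.96)
```

Under this approximation the statistic falls just short of the two-sided five per cent point, which agrees with the statement that the difference is not significant at that level.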

If only the less capable students have been 
rejected, then the premedical averages and 
the Moss scores (see Tables 2 and 3) of the 
students receiving the M.D. degree should not 
differ significantly. The differences between 
the mean values of these two variables are 
significant at better than the one per cent 
level of confidence. The groups receiving the
M.D. degree, therefore, cannot be considered 
homogeneous with respect to mean scores on 
the Moss and the premedical grade point 
averages. Furthermore, for these selected stu- 
dents, the correlation coefficients between the 
two selection variables and grades received in 
the five courses of the freshman year are not 
significantly different for the two samples, 
16 of the 20 coefficients being significantly 
different from zero at the five per cent level 
(N = 66, r = .24). One of the four non- 
significant correlations (r = .19) was between 
the Moss and the histology grades for Class II. 
It would seem reasonable to conclude here that 
the factors represented by the two selection 
variables were about equally important for 
success in the five freshman courses. Since the 
significant differences in course grades were 
confined to two courses and since the significant 
differences in the selection variables were not 
eliminated by rejecting failing students, the 
grading procedures apparently were not oper- 
ating to eliminate the students with the lowest 
standings on these variables.
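
The threshold quoted above (N = 66, r = .24) can be approximated with Fisher's z transformation. The paper does not state how its critical value was obtained, so the method and the function name here are assumptions offered only as a check on the order of magnitude.

```python
import math


def critical_r(n, z_crit=1.96):
    """Approximate smallest |r| differing significantly from zero.

    Uses Fisher's z transformation, whose standard error is 1 / sqrt(n - 3);
    z_crit is the two-sided normal deviate (1.96 for the five per cent level,
    2.576 for the one per cent level).
    """
    return math.tanh(z_crit / math.sqrt(n - 3))


if __name__ == "__main__":
    print(round(critical_r(66), 2))          # five per cent level, N = 66
    print(round(critical_r(66, 2.576), 2))   # one per cent level, N = 66
```

For N = 66 the five per cent value from this approximation rounds to the .24 given in the text.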

Inasmuch as the only significant difference in the average grades assigned
the two groups for each of the four years was for the first year average,
the data are considered as supporting the hypothesis that these differences
in initial ability were not responsible for the differences in the mean
scores of the five first year courses. In other words, the results are
interpreted as indicating that the standards of evaluation used for the two
groups probably were not strictly comparable.

2. Intercorrelation of the Variables. The complete correlational analysis
was restricted to eleven variables: five freshman courses,⁴ the average
grades for each of the four years of medical school, and the two predictor
variables. The data on five of the six freshman courses were included to
provide an estimate of the reliability of the first year grades in
predicting relative success in the last three years of medical school.

Thirty-seven of the correlations for Class I and 42 correlations for Class
II (50 correlations were computed for each class) were significantly
different from zero at the one per cent level.⁵ Since the correlations were
all zero or positive, the hypothesis that a single, common factor would
account for the intercorrelations⁶ of ten variables, excluding first year
average grades, was investigated. The hypothesis was tested in terms of the
magnitude of the difference between the Z's corresponding to the observed
and predicted correlations. The predicted correlations were computed from
the cross product loadings on the single common factor by the formula
$r_{jk} = a_{j1} a_{k1}$. Since this procedure dealt with the discrepancy
between a theoretical and observed value of a Z, the standard error of the
theoretical Z was defined as $\sigma_Z = 1/\sqrt{N - 3}$.

⁴ Detailed analysis of separate course grades could be justified only for
five of the six freshman courses; these courses accounted for eighty-five
and fifty-eight per cent of all the failing marks assigned to Class I and
Class II, respectively.

⁵ The correlations which were not significantly different from zero although
at the five per cent level for Class I were: physiology with anatomy, fourth
year average with anatomy, premedical grade point with anatomy, fourth year
average with histology, premedical grade point with histology, neuroanatomy
with physiology, neuroanatomy with second year average, neuroanatomy with
third year average, neuroanatomy with premedical grade point, neuroanatomy
with the Moss test, second year average with the Moss test, third year
average with the Moss, and the fourth year average with the Moss.
Non-significant correlations for Class II were: anatomy with fourth year
average, anatomy with the Moss, histology with premedical grade point,
histology with the Moss, biochemistry with the Moss, physiology with the
Moss, neuroanatomy with the Moss, fourth year average with premedical grade
point average, and fourth year average with the Moss test.

⁶ The factor loadings were computed by Thurstone's direct summation method
for the special case of unit rank (5, p. 227).

Significant residuals between the observed and theoretical Z's (five per
cent level) were found for three pairs of variables: (1) second vs. third
year average grades; (2) anatomy vs. physiology grades; and (3) biochemistry
vs. physiology. Only one residual corresponding

Table 2 


Means, Standard Deviations, and Intercorrelations for Eleven Variables for the 
Iowa Medical College Class I (N = 66) 


Code Variable Mean S.D. 1 2 3 4 5 6 7 8 9 10 11
Fi Anatomy—first year 2.2 Si = S57) A AO 46 G76) A9 .32 u25 Da Gp 
OR Histology—first year 21 $7 S7 = 48 5 36 (72) 50 34 30 2 a 
3 Biochemistry—first 3 
Väo 21 83 42 48 — 58 50 (82) 50 48 40 52 39 
4 Physiology—first year 20 65 16 35 58 — .21 (67) 28 .24 32 24 28 
5 Neuroanatomy—first 
a Year a 67 AGE SO) 280) BL GH) GSO AS SOB RZ 
6t First year average 
peg Stade (six courses) 2.1.46 (.76)§ (72) (82) (67) (57) — 61 49 41 45 39 
Second year average x 
„ „grade 22 42 50 50 50 .28 St 61 — 71 34 32 129 
8* Third year average 5 
Dance z4 45 34 34 48 .24 50 49 71 — AY 35) 128 
9 Fourth year average 
tak 26 32 25 30 40 32 43 41 34 47 ED 
Ot Premedical gp. 5 
s average 35 1.78 23 (27 :52 .24 38 .45 432.55 AE SR 
1 Percentile rank on 2 
Moss Aptitude Test 5.2 251 32 44 .32 28 33 39 29 28 21 38 — 
* The average grade represents a weighted average based on the number of
clock hours per course and an assigned value for each letter grade.
† The premedical grade point averages coded as single digit scores by a
linear transformation.
‡ The national percentile ranks on the Moss Aptitude Test coded as single
digit scores by a linear transformation.
§ The correlations in parentheses are not experimentally independent and
should be disregarded.

to the correlation between second and third year average grades was
significant at this level for Class II. These results neither clearly
sustain nor refute the hypothesis for a single factor; the results can be
considered equivocal.

Since the hypothesis of a single factor is not clearly sustained, the
possible influence of changes in the course content on the pattern of
correlations (see Tables 2 and 3) of the average grades might account for
other factors. The low correlations between the grades of the last 2 years
may reflect the unreliability of judgments of performance in clinical work.
However, the correlations of .71 and .75 between the second and third year
grades would indicate that the reliability of the third year clinical
ratings was perhaps comparable to that of the second year marks. Therefore,
if low reliability is the major factor in the low correlations of the last
two years, the unreliability would appear to be concentrated in the senior
year.


Although the pattern of intercorrelations 
among the average grades for the four years is 
not consistent with the predicted values, the 
fact that the correlations are positive and 
significant does indicate a degree of consistency 
necessary for an acceptable criterion. The 
series of grades included here, in fact, might be 
considered as arising from the presence of a 
single, common factor. Further support for 
the hypothesis is provided by the fact that 
the relations are consistent for both samples 
studied. 
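
The single-factor test described in this section compares each observed correlation with the product of two factor loadings after transforming both to Fisher's Z. A minimal sketch follows, assuming the loadings are already available; the loadings of .70 used in the example are hypothetical and are not taken from the study, and the function names are illustrative.

```python
import math


def fisher_z(r):
    """Fisher's Z transformation of a correlation coefficient."""
    return math.atanh(r)


def single_factor_critical_ratio(r_observed, loading_j, loading_k, n):
    """Residual between observed and predicted Z, in standard-error units.

    The predicted correlation under one common factor is the product of the
    two loadings; the standard error of Z is 1 / sqrt(n - 3).
    """
    r_predicted = loading_j * loading_k
    standard_error = 1.0 / math.sqrt(n - 3)
    return (fisher_z(r_observed) - fisher_z(r_predicted)) / standard_error


if __name__ == "__main__":
    # Observed r of .71 (second vs. third year averages, Class I, N = 66)
    # against hypothetical loadings of .70 on the common factor.
    ratio = single_factor_critical_ratio(0.71, 0.70, 0.70, 66)
    print(round(ratio, 2), abs(ratio) > 1.96)
```

A ratio beyond 1.96 in absolute value would mark the residual as significant at the five per cent level, which is the sense in which a pair of variables is said to depart from the single-factor prediction.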

Table 3

Means, Standard Deviations, and Intercorrelations for Eleven Variables for the
Iowa Medical College Class II (N = 74)

Code  Variable                                Mean  S.D.   1       2      3      4      5      6      7      8      9      10     11
1     Anatomy—first year                      2.7   .62    —       .32    .51    .44    .45    (.82)  .47    .42    .28    .45    .27
2     Histology—first year                    2.8   .49    .32     —      .33    .33    .32    (.50)  .30    .37    .35    .25    .19
3     Biochemistry—first year                 2.3   .78    .51     .33    —      .64    .53    (.81)  .53    .46    .48    .44    .21
4     Physiology—first year                   2.3   .78    .44     .33    .64    —      .49    (.80)  .58    .43    .44    .31    .24
5     Neuroanatomy—first year                 2.4   .64    .45     .32    .53    .49    —      (.64)  .55    .50    .43    .33    [?]
6*    First year average grade (six courses)  2.5   .51    (.82)§  (.50)  (.81)  (.80)  (.64)  —      .64    .55    .49    .37    .37
7*    Second year average grade               2.3   .40    .47     .30    .53    .58    .55    .64    —      .75    .33    .44    .43
8*    Third year average grade                2.4   .42    .42     .37    .46    .43    .50    .55    .75    —      .47    .43    .42
9*    Fourth year average grade               2.4   .32    .28     .35    .48    .44    .43    .49    .33    .47    —      .20    .19
10†   Premedical g.p. average                 4.3   1.59   .45     .25    .44    .31    .33    .37    .44    .43    .20    —      .35
11‡   Percentile rank on Moss Aptitude Test   5.9   2.23   .27     .19    .21    .24    [?]    .37    .43    .42    .19    .35    —

* The average grade represents a weighted average based on the number of
clock hours per course and an assigned value for each letter grade.
† The premedical grade point averages coded as single digit scores by a
linear transformation.
‡ The national percentile ranks on the Moss Aptitude Test coded as single
digit scores by a linear transformation.
§ The correlations in parentheses are not experimentally independent and
should be disregarded.

3. Reliability of the Assigned Grades. The reliability of a composite first
year grade, defined as the simple sum of grades in five courses, was
estimated on the assumption that a single, general factor was sufficient to
account for the intercorrelations of the grades. This assumption was tested:
(1) by determining the factor loadings for the special case of unit rank⁷
from the intercorrelation of the five freshman courses; and (2) by
determining the significance of the difference between the observed and
predicted Z values. None of the differences was significant at the five per
cent level; the assumption, therefore, could be accepted.

The reliability coefficient of the grades assigned was estimated by two
procedures. In the first estimate, the analysis of variance method
formulated by Burt (1) and Hoyt (2) was employed. The error variance is
defined as the interaction or residual variance. Since the variances between
individuals were significant for both sets of data beyond the one per cent
level (F₁ = 18.48 and 23.38 for Classes I and II, respectively),⁸ the
reliability coefficient can be considered significant. Both of these
reliability coefficients were estimated⁹ as .95. This procedure also
provided a test of the significance of the differences between the means of
courses for each group (F₂ = 4.94 and 12.88 for Classes I and II,
respectively).¹⁰ Both F ratios were significant at beyond the one per cent
level, and the variations between mean grades of the courses for each group
are greater than those expected by chance.

The second estimate of reliability was obtained by the Kuder-Richardson
formulation. The highest of the four intercorrelations of the freshman
course grades should provide a conservative estimate of the reliability of
the separate grades. By this method, the reliability coefficients for the
averages were .89 and .90 for Class I and Class II, respectively. It can be
concluded, therefore, that the reliability coefficient probably lies between
.89 and .95 for the average of the first year grades, a value that indicates
a satisfactory degree of stability.

⁷ Thurstone's (5) direct summation method of evaluation was used.

⁸ F₁ is defined as the variance between individuals divided by the variance
for interaction.

⁹ $r_{jj}$ is defined as the ratio of the difference between the variance
between individuals and the error variance to the variance between
individuals.

¹⁰ F₂ is defined as the variance between courses divided by the error
variance.




The resulting lower intercorrelations between the first year average grades
and the other variables could not be attributed to the unreliability of
these marks. It may be concluded, therefore, that the first year marks do
exhibit a desirable degree of discrimination and reliability. The obtained
correlations (.61 and .64) of first and second year grades are significantly
lower than those reliability coefficients. The explanation of these
correlations is probably not the unreliability of the marks, but rather
other factors such as the shift of the objectives in the course instruction
or in the measurement of these objectives from instructor to instructor.
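
The analysis-of-variance reliability estimate used in this section can be reproduced from the reported F ratios alone. In Hoyt's formulation the coefficient is the between-individuals variance minus the error variance, divided by the between-individuals variance, which is the same as 1 - 1/F. The sketch below assumes that formulation; the function name is hypothetical.

```python
def hoyt_reliability(f_ratio):
    """Reliability from the ratio F = V(individuals) / V(error).

    (V_ind - V_err) / V_ind simplifies to 1 - 1 / F.
    """
    if f_ratio <= 0:
        raise ValueError("F ratio must be positive")
    return 1.0 - 1.0 / f_ratio


if __name__ == "__main__":
    for label, f_ratio in (("Class I", 18.48), ("Class II", 23.38)):
        print(label, round(hoyt_reliability(f_ratio), 3))
```

Both values fall within about .01 of the .95 reported for the two classes.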
4. Prediction of Medical School Success. The preceding analysis has
indicated that the criterion measures may be considered as having some
definite factor in common and that the reliability and the discrimination
capacity of the freshman grades were adequate. The relationship of two
possible predictor variables to these measures of medical school success was
investigated. The selection variables were the premedical grade point
average, and the percentile rank on the Moss Aptitude Test. The analysis
included the computation of the zero-order correlations and the multiple
correlation for all cases in each class receiving the M.D. degree. Point
biserial correlations were determined for all members of the freshman class;
the criterion was "pass-fail" in medical school.

The zero-order correlations between the premedical grades and the four
criterion measures (the first, second, third, and fourth years of medical
averages for the selected cases) are contained in Tables 2 and 3. For Class
I, the highest correlation of .45 with the above criteria was obtained
between the premedical grade point average and the first year average grade
in medicine. The correlations of premedical grades with the second and third
year average were higher for Class II than for Class I (.44 [...] for Class
II second and third year [...]). The differences in the correlations are not
statistically significant, as noted above.

The multiple correlations between the criterion measures and combined scores
on the Moss and the premedical average exceed the larger zero-order
correlations by at least .05 for five of the eight coefficients. For Class
I, the multiple correlations for the first three years are significant at
the one per cent level. Prediction of fourth year achievement by the
multiple correlation technique was not improved over that from the
zero-order correlation value of .43. The combined use of the Moss and
premedical grade point may be recommended when the selection ratio employed
is low, but the effectiveness of the prediction of rank in class is low.
Average grades for Class II can be predicted more closely by a combination
of the two predictor variables during the second and third years; and on
every level, a combination of the two variables results in a greater value
than the zero-order correlation by itself.
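
For two predictors, the multiple correlation follows directly from the three zero-order correlations. The sketch below uses the standard two-predictor formula; the function name is hypothetical, and the example values (.44, .43, and .35) are the Class II second year entries as read from Table 3 of the scan, so they should be treated as illustrative rather than as the study's reported result.

```python
import math


def multiple_r_two_predictors(r_y1, r_y2, r_12):
    """Multiple correlation of a criterion with two predictors.

    r_y1, r_y2 : zero-order correlations of each predictor with the criterion
    r_12       : correlation between the two predictors
    """
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(r_squared)


if __name__ == "__main__":
    # Criterion: second year average; predictors: premedical average and Moss.
    big_r = multiple_r_two_predictors(0.44, 0.43, 0.35)
    print(round(big_r, 2), round(big_r - 0.44, 2))
```

With these inputs the combined predictors exceed the larger zero-order correlation by more than .05, in line with the statement that the second year average for Class II is predicted more closely by the combination.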


Summary 


This study has been concerned with the 
characteristics of the measures employed as 
the criterion of success in one medical school. 
With respect to the formal aspects of the 
criterion, the findings may be summarized as 


follows: 

1. The consistency of the distributions of 
grades for the two classes were not of compara- 
ble ability and achievement. The conclusions 
reached were that either (a) the differences in 
the measure of ability were reflected in, or 
pertinent to, only two of the freshman courses; 
or (b) the standards of evaluations used varied 
from class to class. The second of these 
alternatives was considered more tenable. 

2. That the intercorrelations found between 
eight measures of medical performance (grades) 
and two measures of premedical achievement 
(premedical average grades and the Moss 
Aptitude Test) could be accounted for by a 
single variable or common factor was equivocal, 
since these differences between the observed 
and predicted Z values were significant at the 
five per cent level. 

3. The hypothesis of a single common factor 
for the five freshman courses was sustained. 
These five courses can then be considered as 
measuring some single hypothetical variable, 
and would be interpreted as a complex function 
of ability, motivation, and work habits. 




4. On the assumption of a single, common 
factor, two different estimates of the reliability 
coefficient of the average course grades were 
obtained: .89 and .95 for Classes I and II, 
respectively. The more conservative estimates 
were based on the Kuder-Richardson formula- 
tion with the separate course reliabilities esti- 
mated as the highest intercorrelations. The 
higher estimate was secured from an analysis 
of variance procedure. These estimates indi- 
cated that the freshman grades possess a 
desirable degree of internal consistency. 

5. The relations between relative standings on two measures of ability
(premedical grade point average and the Moss Aptitude Test) and success in
the freshman courses were generally low but of comparable magnitude.


Received October 17, 1951. 


References 


1. Burt, C. The reliability of teachers' assessments of
their pupils. Brit. J. educ. Psychol., 1945, 15,
80-92.

2. Hoyt, C. Test reliability attained by analysis of
variance. Psychometrika, 1941, 6, 153-160.

3. Kuder, G. F., and Richardson, M. W. The theory
of the estimation of test reliability. Psychometrika,
1937, 2, 151-160.

4. Stuit, D. B. Personnel research and test development.
Princeton: Princeton Publishing Co., 1947.

5. Thurstone, L. L. Multiple factor analysis.
Chicago: Univ. of Chicago Press, 1947.



A Comparison of Three Criteria of Aircrew Effectiveness 
in Combat Over Korea* 


John K. Hemphill and Lee B. Sechrest 


Personnel Research Board, The Ohio State University 


This paper reports a study of three criteria of the performance of 94 B-29
aircrews which flew combat missions over Korea during the period extending
from March to September [...]. The three criteria to be considered are: (1)
ratings by superiors of the performance of crews as units; (2) sociometric
nominations from crew members; and (3) objective records of combat bombing
accuracy. Following a brief description of each of these criteria, they will
be compared in terms of their reliability and interrelationship. A general
problem in the use of superiors' ratings as criteria is evidenced by the
apparently paradoxical finding of substantial and statistically significant
relationships between (1) bombing data, which have no reliability, and (2)
reliable superiors' ratings. This paradox is more apparent than real for it
can be explained in terms of the contaminating effect of "unreliable"
information shared in common by the raters.

* Based on a paper presented at the 1952 meeting of the American
Psychological Association. This research was conducted under a contract
between The Ohio State University Research Foundation and the Human
Resources Research Laboratories (HRRL), [...] Air Force. The data were
collected by Lt. Col. Fred E. Holdrege and [...] representatives of HRRL.
The authors wish to express appreciation to Mr. [...], Chief, Personnel
Research Directorate, HRRL, who initiated plans for the study of these
crews. Interpretations and opinions expressed here are those of the authors
and should not be regarded as having the endorsement of the Air Force.

Superiors' Ratings

The aircrews were rated as units by squadron and wing staff officers in
terms of their performance in carrying out combat missions over Korea. The
ratings were accomplished through the use of an eleven-item rating form. The
eleven rating variables were:

1. Skill as Technicians: The degree of knowledge crew members have of their
specialties as indicated by their performance or the degree of skill they
exhibit in handling various equipments.

2. Successful Completion of Missions: The 
degree to which the crew reaches and bombs 
prescribed targets; including making necessary 
decisions in the absence of specific instructions 
and overcoming obstacles. 

3. Accuracy in Bombing Targets: The accu- 
racy with which targets are identified and 
bombed. 

4. Effectiveness of Crew Leadership: The de- 
gree to which the aircraft commander organizes 
the crew to facilitate teamwork and coopera- 
tion among crew members. 

5. Consideration of Men on the Crew for One 
Another: The extent to which crew members 
look out for the welfare of the crew as a whole, 
are liked by other men on the crew, and turn 
to one another as friends. 

6. Effectiveness in Working with Other Crews:
The degree to which the crew works as a part
of a larger team and cooperates with other
crews in carrying out a group effort.

7. Effectiveness in Working with Superior
Officers: The degree to which the crew accepts
orders or suggestions from superior officers and
achieves objectives without conflict with superiors.

8. Care of the Aircraft: The degree to which
the crew members insure proper maintenance
of their aircraft and take personal interest in
the plane and its equipment.

9. Following SOP: The degree to which the
crew members carry out their functions in the
prescribed manner.

10. Military Bearing of Crew Members: The
degree to which members of the crew conduct
themselves in a military manner.

11. Over-all Value to the Squadron (Wing): 
The degree of over-all effectiveness of the crew 
as a part of combat unit. 


Ratings on the items or variables were expressed as numerical values along
a nine-point scale. Each point on the scale was defined, nine being the
rating given to "undoubtedly the best crew in the squadron" and one being
the rating given to "undoubtedly the worst crew in the squadron." All
ratings were obtained in interviews with the raters.




The original plan was to secure a minimum 
of five independent ratings for each crew. 
This plan proved to be impractical due to the 
difficulty of locating raters who knew the 
crews sufficiently well to rate all variables. 
A total of 24 wing and squadron officers were 
utilized as raters in securing the ratings of the 
94 crews. The mean number of ratings per 
crew actually obtained was 2.7 and the mean 
number of crews rated by each rater was 10.5. 

An examination of the means and standard 
deviations of the ratings obtained from each 
of the 24 raters showed marked differences in 
their rating habits or bias. Before the ratings 
given a single crew by different raters were 
combined to form the final rating of the crew’s 
performance, an adjustment was made to com- 
pensate for the observed rater bias. The 
adjustment was computed in such a manner 
that each rater’s mean rating was approxi- 
mately 50 and his standard deviation approxi- 
mately 10. 
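
The adjustment for rater bias described above can be sketched as a linear standard-score conversion. The paper does not give its exact formula (it says only that each rater's mean became "approximately" 50 and his standard deviation "approximately" 10), so the function below is an assumption, and the two raters' raw ratings are hypothetical.

```python
from statistics import mean, pstdev


def standardize_rater(ratings, target_mean=50.0, target_sd=10.0):
    """Linearly rescale one rater's ratings to a common mean and SD.

    Assumed procedure: an ordinary standard-score conversion applied
    separately to each rater. The paper does not state its formula.
    """
    m = mean(ratings)
    s = pstdev(ratings)  # SD of this rater's own ratings
    if s == 0:
        raise ValueError("rater gave identical ratings; cannot rescale")
    return [target_mean + target_sd * (r - m) / s for r in ratings]


# Hypothetical nine-point ratings from a lenient and a severe rater
# who rank the same six crews in the same way.
lenient = [7, 8, 9, 8, 7, 9]
severe = [2, 3, 4, 3, 2, 4]

lenient_std = standardize_rater(lenient)
severe_std = standardize_rater(severe)

print([round(x, 1) for x in lenient_std])
print([round(x, 1) for x in severe_std])
```

After conversion the two raters' scores coincide, so a difference in leniency alone no longer affects a crew's combined rating.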

The reliabilities of these converted ratings 
were estimated by application of a method 
developed by Horst (1) that is designed for the 
case in which varying numbers of raters are 
available. Intercorrelations of the eleven crew 
performance variables were also calculated. 
These intercorrelations along with the reliabilities,
means, and standard deviations of the
eleven variables are presented in Table 1. The
reliability of these ratings ranges from .61 to
.95 and meets standards which are generally
acceptable for rating data. 

The intercorrelations of the rating variables 
were factor analyzed and the final residual 
matrix and rotated factor loadings are pre- 
sented in Table 2. Intercorrelations among 
the rating variables are relatively large. How- 
ever, in addition to the large general factor, 
four additional factors are needed to account 
for the correlations. The five factors identified 
were: 


I. General Reputation (Loadings on all vari- 
ables); II. Objective Achievement (Loadings 
on 1, 2, 3, 4, 7 and 11); III. Technical Per- 
formance (Loadings on 1, 3, 9, 11); IV. Atten- 
tion to Equipment and Procedures (Loadings 
on 8 and 9); and V. Social Performance (Load- 
ings on 5 and 6). 


Sociometric Nominations 


On a sociometric nomination form, crew 
members were asked the following question: 
“If you could make up a crew from among the
crew members in your squadron, whom would
you choose for each crew position?” There
were three general possibilities of reaction to
the nomination question: (1) nomination of a


Table 1


The Reliability and Interrelationships of Eleven Variables of Aircrew Performance Rated by Superiors

Note: N = 83


                                       Intercorrelation*
Variable                    1   2   3   4   5   6   7   8   9  10  11    Mean  Standard Deviation

 1 Technical Skill                                                       49.6  8.21
 2 Completion of Missions  84                                            51.3  7.64
 3 Bombing Accuracy        74  68                                        47.7  9.65
 4 Leadership              83  81  63                                    49.3  [illegible]
 5 Consideration           62  66  56  70                                50.3  10
 6 W.W. Other Crews        69  69  63  74  73                            49.6  [illegible]
 7 W.W. Superiors          79  77  66  81  70  76                        50.8  7.66
 8 Care of Aircraft        66  64  47  63  61  67  56                    50.1  [illegible]
 9 Following SOP           80  73  62  78  59  68  75  66                50.2  7.74
10 Military Bearing        68  59  60  63  60  58  71  49  71            48.2  7.23
11 Over-all Value          85  78  73  82  67  78  78  61  82  62        47.4  10.25

   Reliability             80  67  83  80  95  63  80  67  72  61  68


* Decimal points have been omitted from these tables.




Table 2


The Final Residual Matrix and Rotated Factor Loadings Resulting from Factor Analysis of the
Intercorrelations of the Eleven Crew Performance Variables


                                          Residual*                        Rotated Factor Loadings*
Variable                    1   2   3   4   5   6   7   8   9  10  11       I   II  III   IV    V   h²

 1 Technical Skill                                                         68   35   52   25   00   92
 2 Completion of Missions  03                                              68   65   20   05   10   94
 3 Bombing Accuracy        00  04                                          56   20   50   10   20   65
 4 Leadership             -03  03 -08                                      69   30   40   30   15   84
 5 Consideration          -01 -01 -03  00                                  67   20   10   20   50   79
 6 W.W. Other Crews        00  01  01  00 -03                              67   20   20   25   40   75
 7 W.W. Superiors          00  00 -05  02 -02  03                          67   30   40   10   30   80
 8 Care of Aircraft        00  02  00 -06  03  04 -01                      75   15   00   35   00   71
 9 Following SOP          -06  06 -03 -04 -04 -02  03 -07                  70   00   40   55   00   96
10 Military Bearing        04  01  04 -02 -01 -04  07 -06  06              65   10   25   15   20   56
11 Over-all Value         -01  06  01 -01  02  08  00 -03 -01  03          69   20   50   25   10   84

* Decimal points have been omitted from these tables.


crew member who was a member of the same crew
as the nominator; (2) nomination of an
individual from some other crew in the squadron;
and (3) no response. An individual
usually indicated an “on-crew” choice by
writing “same,” “my own crew,” or by writing
the name of a fellow crew member in
the space provided. “Off-crew” choices were
indicated by responding with the name of
someone not on the individual’s crew, and by
such remarks as “Captain Smith’s radio operator,”
or “anyone except the one we have.”

The sociometric nomination data for each
crew were used to compute an index of “on-crew”
choices. The index is the ratio between
the number of “on-crew” choices made and
the total number of choices made. Index
values ranged from .30 to 1.00 and were
approximately normally distributed about a
mean of [value illegible].

In order to test the reliability of the index
of “on-crew” choice, index values made
from random halves of each crew’s total nominations
were correlated with one another.
This correlation was .83 which, when extended
by the Spearman-Brown formula, gave an
estimated reliability of .91.
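
The index and its reliability estimate can be sketched in a few lines. The function names are illustrative only; the Spearman-Brown step is the standard formula for lengthening a measure by a given factor (here doubling, from half-crew to whole-crew nominations), and the .83 is the split-half correlation reported above.

```python
def on_crew_index(on_crew_choices, total_choices):
    """Ratio of "on-crew" choices to all choices made by a crew."""
    if total_choices <= 0:
        raise ValueError("a crew must have made at least one choice")
    return on_crew_choices / total_choices


def spearman_brown(r, factor=2.0):
    """Step up a reliability coefficient for a measure lengthened by `factor`."""
    return factor * r / (1.0 + (factor - 1.0) * r)


# Hypothetical crew: 9 of its 12 nominations named fellow crew members.
print(on_crew_index(9, 12))

# Split-half correlation reported in the text, stepped up to full length.
print(round(spearman_brown(0.83), 2))
```

With r = .83 the stepped-up value is 2(.83)/(1 + .83), which rounds to the .91 reported.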


Bombing Error Criteria


The accuracy with which an aircrew is able
to bomb combat targets may be considered
a near ultimate criterion of its effectiveness.
Despite the high relevance of bombing data 
to the problem of evaluation of aircrew effec- 
tiveness, many conditions exist which detract 
from its utility. Chief among these are: (1) 
extremely variable conditions under which 
bombing must be accomplished; (2) severe 
limitations on the possibility of determining
exactly where bombs are dropped in combat; 
and (3) limits upon the number of crews for 
whom the opportunity to perform the com- 
plete bombing operations exists (only the lead 
crew in formation bombing performs the com- 
plete bombing operation). Nevertheless, all 
available data were collected concerning the 
bombing accuracy in combat of the 94 FEAF 
crews. No combat bombing data whatsoever 
were available for 50 of the crews (no lead 
experience). Each of the remaining 44 crews 
had had one or more opportunities to lead 
formations on which it had been possible to 
secure photographs of where the bombs ac- 
tually dropped. The bombing accuracy data 
consisted of circular errors for each of these 
bombing missions. These errors were ex- 
pressed as the linear distance between the 
mean point of impact of all the bombs identi- 
fied in the strike photograph and the assigned 
target. The number of circular error measures 
available for each of the 44 crews ranged from 
1 to 8 with a mean of 3.16. 




Inspection of the distribution of the bombing 
error data disclosed a markedly skewed dis- 
tribution. A log transformation of these data 
yielded data with essentially normal distribu- 
tion. The reliability of the transformed data 
as estimated, again by utilizing Horst’s pro- 
cedure, was not significantly different from 
zero. 
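
The effect of the log transformation on a skewed distribution can be illustrated as follows. The circular errors in this sketch are hypothetical values chosen to have a long right tail; they are not the crews' data, and the moment-based skewness coefficient is simply one convenient summary, not necessarily the check the authors used.

```python
import math


def skewness(values):
    """Moment coefficient of skewness (third moment over SD cubed)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical circular errors (arbitrary distance units), long right tail.
circular_errors = [100, 150, 200, 300, 450, 700, 1200, 2500]
log_errors = [math.log10(e) for e in circular_errors]

print(round(skewness(circular_errors), 2))  # strongly positive
print(round(skewness(log_errors), 2))       # much closer to zero
```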

Table 3 presents an analysis of the variance 
of the bombing data into between-crew differ- 
ences and within-crew differences. It can be 
seen readily that there is approximately as 
much variance between the errors made by 
the same crew on different missions as there 
is between the mean performance of different 
crews. The difference between the mean 
bombing accuracy scores of these crews appears 
to be wholly unrelated to crew differences. 
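
The entries of Table 3 can be checked from its sums of squares and degrees of freedom. The sketch below only reproduces that arithmetic; the function name is illustrative, and no significance test is attempted here.

```python
def variance_ratio(ss_between, df_between, ss_within, df_within):
    """Return (between-crew variance, within-crew variance, F ratio)."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Sums of squares and degrees of freedom as printed in Table 3.
ms_between, ms_within, f_ratio = variance_ratio(208.56, 33, 451.82, 96)

print(round(ms_between, 3), round(ms_within, 3), round(f_ratio, 2))
```

An F ratio this close to 1 is what the text describes: the errors of a single crew vary from mission to mission about as much as the crew means vary from one another.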


Relationships Among the Criteria 


In order to complete the comparison of the 
three criteria, each of the eleven superior rating 
variables was correlated with both the sociometric
index of “on-crew” choice and with the
mean bombing error of each crew for whom 
we had bombing data. Table 4 presents these 
correlations. 

Attention is called to the correlations be- 
tween the bombing data and the superiors’ 
ratings. It is quite apparent that the raters 
utilized the official bombing data as a source 
of information in making their ratings. Mean 
differences in the official bombing records of the 
crews serve to “contaminate” the raters’ judg- 
ments on all rating variables, although the 
contamination is more marked for certain 
variables than for others. It is possible to 
extend the factor analysis of the eleven superior 
ratings to include the correlation with the 


bombing data by the following loadings on the 
five factors: 


Table 3 


An Analysis of the Variance of Transformed 
Bombing Accuracy Data 


Source of          Sum of
Variance           Squares      Df     Variance    F ratio
Between Crews      208.56       33      6.320       1.34
Within Crews       451.82       96      4.706
Total              660.38      129      5.119




Table 4 


Correlations Between Eleven Variables of Aircrew 
Performance Rated by Superiors and (1) the Index 
of “On-Crew” Choice and (2) Bombing 
Accuracy Data 


                               “On-Crew”
                               Choice           Bombing Data
Rating Variable                N     r          N     r     r corr.¹
 1 Technical Skill             90   .20         41   .58   .61
 2 Completion of Missions      88   .10         41   .62   .70
 3 Bombing Accuracy            80   .36**       41   .58   .58
 4 Leadership                  89   .13         41   .63   .67
 5 Consideration               90   .10         41   .42   .48
 6 W.W. Other Crews            90   .06         41   .47   .56
 7 W.W. Superiors              90   .11         41   .57   [illegible]
 8 Care of Aircraft            90   .26*        41   .31   .38
 9 Following SOP               90   .15         41   .40   .47
10 Military Bearing            90   .25*        41   .27   .30
11 Over-all Value              90   .18         41   .47   .54


¹ Corrected for restriction of the range of the rating variables (2).

* Significant at the .05 level.
** Significant at the .01 level.


I. General Reputation, .10; II. Objective
Achievement, .80; III. Technical Performance,
.40; IV. Attention to Aircraft and Procedure,
.30; and V. Social Performance, .25.

The major contamination is with Factor II,
which has its principal loading on the rating
variable, Successful Completion of Missions.
The original hunch responsible for the inclusion
of this rating item was that it might identify
variance in crew performance associated with
low motivation and/or tendencies to abort
missions. It appears, however, that information
concerning the officially recorded performance
of the crews determined much of
this rating.

The sociometric index of “on-crew” choice
does not appear to be related to any marked
extent with the superiors’ ratings. It may be
noted that the larger of these small correlations
are with rating variables which have large
loadings on the Technical Performance and
Attention to Equipment and Procedures factors
but small loadings on the Social Performance
factor. This fact suggests that nominations
may well have been made on the basis of
proficiency rather than just popularity.
Sociometric data may provide a second and
relatively independent estimate of the performance
of the crew. The correlation between
the index of “on-crew” choice and the
objective bombing accuracy data was found
to be .33. This suggests that the sociometric
choices were also influenced by the unreliable
bombing information but to a lesser degree
than most of the superiors’ ratings.


Discussion and Conclusion 


The finding of substantial and significant 
correlations between the objective bombing 
accuracy data, for which we had estimated a 
reliability of zero, and the eleven superior 
rating variables has general methodological
implications for the development of criteria.
An explanation of these correlations can be
found in a possibility of a “contamination” of
the judgments of superiors. The results of
each bombing mission are widely publicized
among the personnel of the wings and squadrons.
In fact, it is standard procedure to
hold a critique of the mission on the morning
of the day following the mission. In addition,
mission results, expressed in terms of bombing
errors, are made part of official records which
are maintained and used in determining which
crews will be given an opportunity to lead
further missions. It can be expected, therefore,
that each rater could have estimated the
official bombing accuracy records of each crew 
with a relatively high degree of accuracy. 




The fact that these records represented little 
other than chance crew achievement was, of 
course, unknown to the raters. This com- 
monly shared, but unreliable, information con- 
cerning the performance of the various crews 
tended to produce both spuriously high reli- 
ability of the superior ratings and the spurious 
correlations with the unreliable bombing data. 

A general question is raised concerning the 
dependability of rating data as criteria. In 
situations where objective achievement infor- 
mation is available, we may expect that raters 
will utilize such information in the process of 
forming the judgments they express in their 
ratings. If these achievement data reflect 
reliable performance, they will, of course, add to 
the dependability of the ratings. However, 
should the achievement information be basi- 
cally unrelated to differences in the perform- 
ance of the individuals or units being rated, this 
fact is likely to be overlooked when a test of 
the agreement between raters proves the rating 
to have “adequate reliability.” Raters may 
agree in their knowledge of the achievement 
records but be in error about the meaningful- 
ness of such records. 


Received August 22, 1952. 
Early publication. 


References 


1. Horst, P. A generalized expression for the reliabil- 
ity of measures. Psychometrika, 1949, 14, 21-24. 

2. Thorndike, Robert L. Personnel selection. New 
York: Wiley, 1949, p. 173. 


Employment Prognosis of the Post-Poliomyelitic 


Leonard V. Wendland 


George Pepperdine College, Los Angeles, California


Within recent years the public has frequently
been stimulated to think of its responsibility
toward the physically handicapped person.
There always have been those who were
physically handicapped. To a large extent
the crippled or deformed person either has been
an object of special pity and attention, or he
has been largely ignored. Certain crippling
diseases, such as poliomyelitis, have rather
consistently contributed their share of handicapped
persons. Of those who contract poliomyelitis,
approximately one-third retain some
degree of residual physical involvement, while
others are left with less noticeable degrees of
muscular weakness (3).

The individual’s need to make a satisfactory
psychosocial adjustment demands economic
independence. This demands, on the part of
the physically handicapped, as it does of the
non-handicapped person, that he seek and
retain employment. This employment must
provide conditions and remuneration so that
the employee will feel financially and emotionally
secure. In this study we have surveyed
the employment history of 151 post-poliomyelitic
subjects, confining our analysis to the
past 10-year period.


General Descriptive Data


Source of Subjects. In conjunction with a
physical aftereffects survey carried out by the
Staff of the Orthopaedic Hospital, Los Angeles,
California, in 1940, the Research Staff of the
hospital carried on a study of the vocational,
educational, and social status of the patients.
The physical aftereffects study was made of
randomly selected clinic patients, all of whom
had had poliomyelitis and all of whom
had been clinic patients of the Orthopaedic
Hospital (1, p. ii). Clinic patients are by
definition charity patients and must be under
21 years of age when treated. This age-limit
becomes an important selective factor in the
mean age of this sample of subjects. Such
factors as sex, race, nationality, religion, and
environmental background are accidental rather
than selective.

Of the 794 patients available for study, those 
16 years of age and older were contacted in the 
1940 survey. Of the 794 patients, a total 
of 437, 203 males and 234 females, were 
interviewed by the Research Staff. The 437 
patients represent all who were willing to 
cooperate in the study. The bulk of the data 
collected, in this 1940 study, has not been 
published, though a number of excerpts have 
been published (4, 5).

In the present study every possible attempt 
was made to trace the present location of each 
of the 437 patients. Those patients residing in 
California, who were located and were willing, 
were used as subjects. Of the 437 patients, 
40.3 per cent were located. Of those patients 
located, 85.8 per cent, or 151, were interviewed, 
8.5 per cent were unwilling to cooperate and
5.7 per cent were deceased. 
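
The sample accounting in this paragraph can be checked with a little arithmetic. Only the figures 437, 151 and the four percentages come from the text; the counts of 176 located, 15 unwilling and 10 deceased are derived here by rounding and are not stated in the paper.

```python
traced = 437        # patients interviewed in the 1940 study
interviewed = 151   # subjects of the present study

# Derived count: 40.3 per cent of the 437 patients were located.
located = round(traced * 0.403)

# Derived counts from the percentages given for the located group.
unwilling = round(located * 0.085)
deceased = round(located * 0.057)

print(located)                                  # patients located
print(round(100 * interviewed / located, 1))    # per cent interviewed
print(interviewed + unwilling + deceased)       # should equal `located`
```

The three groups account for all of the located patients, which suggests the percentages were computed on the located group and not on the original 437.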

Age. The median age for the male subjects
is 32.2 years, for females the median age is
34.7 years, and in terms of the total sample the
median age is 33.9 years. An age-distribution
such as we have would lead one to postulate
that this group of subjects is at the present
time approximately in what may be thought
of as the “prime of life.”

Age of Onset. Psychosocial adjustment to
the disabling aftereffects of poliomyelitis may
be directly or indirectly related to the age at
which the subject contracted the disease. In
the group studied, 42.6 per cent of the males
and 42.3 per cent of the females were under
the age of 4 when they contracted the disease,
and approximately 60 per cent of the males
and 70 per cent of the females had the disease
before their seventh birthday.

Residual Physical Involvement. The locale of
physical involvement is important in at least
two ways: (1) involvement in some areas is
less disabling than others for the individual;
and (2) certain types of involvement can be
much more easily concealed than others. The
degree of residual involvement is likewise
important in evaluating the psychological
aftereffects. One general criticism by the subjects
was that prospective employers tended to
overemphasize their apparent physical limita- 
tions without giving them an opportunity to 
prove what they could do. 

At the present time none of the subjects are 
under treatment for residual paralytic after- 
effects. Each of the subjects was rated as to 
the apparency of the residual aftereffects on 
a five-point scale. A “0” apparency implies 
no apparent residual aftereffects, while a person 
rated as a “4” apparency is so seriously crippled 
that he is incapable of movements without the 
aid of major prosthesis and spends all of his 
time in a wheelchair or bed. In this group of 
subjects 62.3 per cent have either no apparent
residual paralysis or only slight apparent 
involvement. 


Employment History 


The Degree of Physical Disability and Em- 
ployment. It may be assumed that there is 
a rather direct relationship between the degree 
of physical involvement and employability. 
A comparison was made of males and females 
who are employed with those who are unem- 
ployed. Among the males, 89.7 per cent are 
employed, and of the females 91.5 per cent are 
employed either within their own homes or 
otherwise. Approximately 57 per cent of the 
unemployed males have either no or only slight 
physical disability, while approximately 83 per 
cent of the unemployed females have either
no or only slight physical involvement. 

It is difficult to know how to interpret statements
by seriously disabled individuals who
say their physical condition does not interfere
with their work. Among the males 57.3 per
cent indicated that, in their opinion, they have
definite physical barriers to certain types of
employment. However, 78.0 per cent of this
same group said that their physical disability
does not interfere in their present employment.
Two-thirds of the females feel that in spite of
physical involvements their physical condition
does not interfere with their present employment.

Distribution of Occupations Engaged in by Subjects.
The Dictionary of Occupational Titles (2)
was used to classify all those employed.
Table 1 gives a summary of males and females

Table 1

Summary of Occupations

Occupational                    Male             Female
Description                  N       %         N       %
Professional                20    29.4         6     7.2
Semi-professional            3     4.5         3     3.6
Sales                        8    11.8         2     2.4
Clerical                     8    11.8        16    19.4
Service                      1     1.5        44    53.0
Agricultural services        2     2.8         0     0.0
Skilled                     16    23.5         3     3.6
Semi-skilled                 3     4.5         1     1.2
Unskilled                    0     0.0         2     2.4
Unemployed                   7    10.2         6     7.2
Total                       68   100.0        83   100.0


in terms of this standard description of occupations.
It is worthy of note that 29.4 per cent
of the males are employed in what is rated as 
professional work. When the number of male 
subjects engaged in work classed as professional 
is compared with the general population we 
note that while 29.4 per cent of this group are 
so employed, only 4.5 per cent of the general 
Los Angeles population are employed in work 
of this nature (6, p. 190). 

Type of Employer. In “business for them- 
selves” accounts for 26.4 per cent of the male 
subjects. The next largest percentage in this 
group is absorbed by those in the small busi- 
ness concern, while 19.1 per cent of the group 
are employed in some large public corporation. 
Next to domestic work as housewives, the fe- 
males in this study are primarily attracted to 
public service employment. Of the females, 
9.6 per cent, however, are in business for them- 
selves. 

Weekly Income in Relation to Degree of Dis- 
ability. The median weekly income for males 
is approximately $74.64, while for female sub- 
jects it is approximately $59.71. About 25 
per cent of the male subjects earn more than 
$100.00 per week, while a small number have 
weekly incomes in excess of $200.00.

Part A of Table 2 gives the mean weekly
income for males and Part A of Table 3 gives
similar data for females, in terms of
physical involvement. Parts B of Tables 2
and 3 make comparisons of groups of subjects
with varying degrees of residual paralysis and
mean incomes, giving the statistical level of
confidence for each comparison. It might be
assumed that individuals with little or no 
residual paralysis should have a greater mean 
income than those more seriously handicapped. 
Part B of Table 2 indicates that this assump- 
tion is not borne out. Part B of Table 3 does, 
however, confirm this contention. Some of the 
relationships in Parts B in Tables 2 and 3 have
little statistical significance; they are given,
however, to indicate the spread found in the
relationships.

These earnings reflect a significant rise in 
weekly income when compared with the earn- 
ings of this group prior to World War II (5). 
Since 29.4 per cent of the males have profes- 
sional employment, the median income of these 
male subjects has been compared with the 
median income of professional employees of the 
general Los Angeles population. The annual 
median income for subjects employed in pro- 
fessional positions is about $5,124.50, whereas 
the median income for comparable employees 
of the general Los Angeles population is 
$3,972.00 (6, p. 294). By interpolation we 


Table 2 


Mean Weekly Income of Employed Males in Terms 
of Handicap—with Level of Confidence 
of Relationships 


A

Degree of        Mean          Total
Apparency        Income        N         %
    4             50.00         1       1.6
    3             96.46        10      16.1
    2            126.56        14      22.6
    1             79.44        27      43.6
    0             94.51        10      16.1
                               62     100.0

B

Inter-Group        Statistical Level of
Relationship       Confidence in Rank Order
    0-1            [illegible]
    1-4            .30
    2-3            .40
    2-4            .40
    3-4            .50
    0-4            .50
    1-2            .05
    1-3            .30
    0-2            .40
    0-3            .90



Table 3 


Mean Weekly Income of Employed Females in Terms 
of Handicap—with Level of Confidence 
of Relationships 


A

Degree of        Mean          Total
Apparency        Income        N         %
    4             22.57         3       8.6
    3             49.74         8      22.9
    2             93.50         7      20.0
    1             60.13        13      37.1
    0             62.50         4      11.4
                               35     100.0

B

Inter-Group        Statistical Level of
Relationship       Confidence in Rank Order
    1-4            .01
    0-4            .05
    3-4            .05
    1-3            .10
    2-3            .10
    2-4            .10
    0-3            .30
    0-1            .80
    1-2            .10
    0-2            .40


find that the median income for the subjects
under study is approximately at the seventy-first
centile level when compared with the
median income of the general Los Angeles
population.

About 24 per cent of the male subjects are
in some form of skilled employment. The
median annual income for these males is approximately
$3,874.50, whereas the median income
for comparable employees of the general Los
Angeles population is $2,746.00 (6, p. [illegible]).
By interpolation of data provided, we find that
the median income of the subjects studied is
probably in the vicinity of the eighty-third
centile level when compared with the median
income of the general Los Angeles population.

Since 19.4 per cent of the females hold some
type of clerical position, a comparison was
made between the median annual income of
these subjects and the median annual income
of like employees of the general Los Angeles
population. The median annual income for
these subjects is approximately $2,999.90 as
compared with the $1,728.00 median annual
income for the general Los Angeles population
(6, p. 295). By means of interpolation it may
be ascertained that the median annual income
for the subjects employed in clerical positions is
at approximately the ninety-fourth centile level
when compared with the median annual income
of the general Los Angeles population.

The comparisons made above indicate that 
the annual income for males employed in pro- 
fessional and skilled positions, and for the 
females employed in clerical positions, is in 
each instance above the median annual income 
of the general Los Angeles population in similar 
employment. In most other comparisons it 
was found that the income of the subjects of 
this study compared favorably with the general 
Los Angeles population. 

Relation of Educational Level Reached and 
Weekly Income. In making a study of the 
educational level reached as it related to the 
mean weekly income, it was found that, gen- 
erally, the higher the educational level reached 
the higher the income. Table 4 indicates that 
for males, with the increase of education there 
is an increase of the mean weekly income in 
nearly every instance. Table 5 shows that 
those females with the most education are to 


Table 4 


The Significance of Mean Income of Male Subjects 
in Relation to Education 


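A

Group    Educational Level                   Total          Weekly Mean
Code     Reached                             N              Income
1        Professional degree                 7              $162.15
2        Graduate study                      4              [illegible]
         College graduate                    [illegible]    [illegible]
         [illegible] years additional study  [illegible]    [illegible]
         [illegible] years additional study  [illegible]    [illegible]
         0-1 year additional study           [illegible]    [illegible]
         High school graduate                [illegible]    57.98
         High school non-graduate            [illegible]    [illegible]

B

Inter-Group        Statistical Level of
Relationship       Confidence in Rank Order
    3-7            .05
    1-4            .05
    1-5            .10
    1-6            .10
    1-7            .10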




Table 5 


The Significance of Mean Income of Female Subjects 
Employed Outside of Home in Relation 
to Education 


A

Group    Educational Level                Total    Weekly Mean
Code     Reached                          N        Income
1        Professional degree               1       $269.00
2        Graduate study                    3         84.58
3        College graduate                  2         58.75
4        1-3 years additional study       10         55.92
5        0-1 year additional study        15         50.05
6        High school graduate              1         41.75
7        High school non-graduate          1         48.00
8        Grade school only                 1         53.20
                                          34

B

Inter-Group        Statistical Level of
Relationship       Confidence in Rank Order
    1-4            .01
    1-5            .01
    2-5            .01
    1-2            .05
    1-3            .05
    2-4            .05

be found in positions in which the highest 
salaries are earned. Parts B of Tables 4 and 
5 give the statistical significance for some of the 
relationships between educational level reached 
and mean weekly income.

The Role of Prosthesis and Employability.
Approximately one-half of the 61 employed 
males do not wear any type of prosthesis, 
Another 10 per cent of the males wear only 
some form of orthopaedic shoe; that is, a shoe 
with a lift built into it or some other form of 
correction. 

About 22 per cent of the females who are
employed outside of the home do not require
any form of prosthesis, while 8.3 per cent of
this group require leg braces and are confined 
to a wheel chair. Approximately 55 per cent 
of the females require some form of prosthesis 
as compared with 35 per cent of the males. 

Of those males and females who are at present 
unemployed, 71.4 per cent do not require any 
form of prosthesis. Accordingly we may as- 
sume that those males and females who are 
unemployed are not primarily so, due to 
physical involvement, but that there are other 
more significant factors. 



Opportunities for Advancement in Present 
Employment. Employee morale is heightened 
when one feels that his employment offers 
opportunities for advancement. About 57 per 
cent of the males and 33 per cent of the females 
feel that they have employment which offers 
opportunities for advancement. 

Job Stability. Another indication of occu- 
pational adjustment may be inferred from job 
stability. It is assumed that a lesser number 
of employment shifts made within the last 
10-year period is indicative of a better occupa- 
tional adjustment than is frequent employment 
shifting. 

About 25 per cent of the male subjects have 
been employed by the same employer for at 
least the last 10-year period; and 57.1 per cent 
of the male subjects have had a maximum of 
3 different employers during the last 10 years. 

About 78 per cent of the females have had 
no more than 3 different employers during the 
last 10-year period. About 18 per cent of the 
females have been employed in the same estab- 
lishment during the last 10 years. 

An analysis of data was made regarding the 
relationship of job stability and residual physi- 
cal involvement. The indications are that 
those subjects with no evident residual paralysis 
and those most severely handicapped had a 
greater tendency to shift jobs than did those 
subjects with moderate to moderately severe 
residual physical involvement. 

Subjects Who Had Been Employed in “War 
Work.” The last 10-year period includes a 
number of years in which the nation’s economy 
was organized in terms of the materials required 
by the war effort. During the war years labor 
was at a premium. Individuals who, previous 
to the war, had had difficulty finding employ- 
ment now found that their services were being 
solicited. About 62 per cent of the males and 
27.5 per cent of the females have been employed
in “war work.” All of the subjects who took 


“war work” had had previous employment or 
were self-employed, 


In this study 10 males were rated as having 
no apparent residual involvement. Of these 
males, half saw active service during the recent 
war, in some branch of the armed forces, 


Summary 


In this paper we have noted the following 
relevant data obtained from a study of the 




past 10-year employment history of 151 post- 
poliomyelitics. 

1. About 90 per cent of the males and 43 
per cent of the females are employed outside 
of the home. 

2. About 26 per cent of the males and 10 
per cent of the females are owners of business 
establishments.

3. The median weekly income for males is
approximately $73.64, while that for females is
$59.71. 

4. There is, in some cases, a significant 
relationship between education and income, 
with the higher incomes being earned by those 
with the most education. 

5. About 55 per cent of the females and 35 
per cent of the males require some form of 
prosthesis. Neither serious physical involvement
nor the need for prosthesis seems to be a
major factor for those subjects who are at 
present unemployed. 

6. About 25 per cent of the males and 18 
per cent of the females have been employed
in the same employment for the last 10 years.

7. It was indicated that 62 per cent of the 
males and 28 per cent of the females have had
employment related to the nation’s war effort.
Half of the males with no evident residual 
involvement served in the armed forces. 


8. The data would indicate that the employment
prognosis of the post-poliomyelitic compares
favorably with that of the non-handicapped
population of the Los Angeles area.


Received September 28, 1951. 






A Job Preference Survey for Industrial Applicants* 


W. F. Long** 


Purdue University 


In a comprehensive discussion of the nature 
of interests, Super (10) states that interests 
have probably received more attention from 
vocational psychologists during the past gen- 
eration than any other single type of human 
characteristic, including intelligence, aptitudes, 
and personality traits. 

In spite of this considerable expenditure of 
effort, few, if any, of the large number of inter- 
est assessment devices which are currently 
available, are entirely satisfactory for use in 
business and industry as an aid in the selection 
and placement of unskilled and semi-skilled 
workers. All of those in general circulation 
have one or more limiting characteristics for 
this purpose. Almost all are intended to focus 
interests into occupational areas, nearly always 
at the professional or semi-professional level. 
The few inventories and information tests
which have been constructed for use at lower
skill levels are too specific for general use.
Many of the inventories or questionnaires
include items which require expressions of
interests concerning activities, jobs, hobbies,
and the like, about which the respondent could
have but limited first-hand information. Perhaps
the severest limitation to the utilization
of published interest inventories for selection
and placement is their susceptibility to falsification
by respondents who attempt to match
their expressed interests to those they believe
necessary for a desired job.

This study is an attempt to develop a Job
Preference Survey¹ which is relatively free from
the above-mentioned limiting characteristics. The
Survey is intended to sample job activity
preferences at a non-professional level. The
interests expressed are reflected as relative
preferences for antagonistic pairs of job activity
types, e.g., routine versus creative work,
rather than for occupational areas. The majority
of items describe unitary tasks with
which most respondents from a population of
business and industrial applicants would be
familiar. The form and content of the items
were chosen in an attempt to enhance validity
by simplicity of presentation while at the same
time minimizing the possibility of falsification
by job applicants.

* This article is based on a thesis submitted in [illegible] under the direction of Professor C. H. Lawshe.

** The author is now a major in the United States Air Force, assigned to duty as Executive Officer, [illegible] Research Laboratory, Human Resources Research Center, Air Training Command, Lackland Air Force Base, San Antonio, Texas.

¹ Copyright [illegible]. Distributed by the Purdue Research Foundation, Occupational Research Center, Purdue University, Lafayette, Indiana.


Construction of Job Preference Survey 


Selection of Job Characteristics for Study. In 
selecting those job characteristics to be studied 
as components of work interests, reference was 
made to psychological literature and to several 
texts and treatises on job analysis. Although 
it was found that fundamental job character- 
istics have not been clearly identified, eight 
antagonistic pairs of job activity types were 
selected for study as possible independent 
components of work interest.² These components
are: 1. Routine—Creative; 2. Indoor—Outdoor;
3. Repetitive—Varied; 4. Responsible—Non-responsible;
5. Hazardous—Non-hazardous; 6. Sedentary—Bodily Active;
7. Isolative—Gregarious; and 8. Precise—Approximate.

Construction of Items. The paired-statement
item form was used because of the nature of
the problem at hand which makes it necessary
to require a choice to be made between two
activities representative of the extremes of a
hypothetical continuum. The couplet form
is also believed to be less confusing to the low
ability job applicant and therefore less likely
to introduce extraneous variance. The pairs
were formed so that the prestige or social
acceptability level and the performance skill
level were judged by the author to be similar.
It obviously was also necessary to make each
pair of statements suggest a choice within
one component only. Forty items were constructed
for each of the eight components.

² After this project had developed so that any modification of the components under study would be impractical, reports by Kelley (4) and Vernon (11) were located which suggested two additional components, viz., Things or Mechanisms—People and [illegible] Activity—Physical Activity. It is intended to include these in an extension of the current study.

The instructions given to the subject, which 
include a sample item from the Sedentary— 
Bodily Active component, follow: 


In the Job Preference Survey you will find 
job activities listed in pairs. Choose the one 
of each pair which you would like best if you 
could perform both activities equally well, and 
would receive the same pay. Make a check 
mark (√) after the job activity in each pair
which you would like best. Mark one and
only one in each pair.

Example:

Keep records of materials in stock        ( )
Deliver materials to machine operators    (√)

For the example pair of job activities, you 
would place a check mark as is shown on the 
line to the right of “Deliver materials to ma- 
chine operators” if you would like best to 
perform that one of the two job activities. 

Work quickly. Do not spend too much 
time thinking about one pair of activities. 
You probably will be able to complete the 
Survey in just a few minutes. 

Example items from each of the other seven 
components follow: 


1. R-C     Pack glassware for shipment.
           Design decorative patterns for glassware.

2. I-O     Operate a large batch mixer in a foods plant.
           Operate a large mixer making ready-mix concrete.

3. R-V     Pack shirts in a factory shipping department.
           Wrap articles in the delivery department of a retail store.

4. R-NR    Supervise a crew of carpenters.
           Work as an expert carpenter.

5. H-NH    Pour melted metals from overhead ladles into molds.
           Prepare molds for making [illegible] castings.

7. I-G     Look up statistical data in a reference library.
           Operate the loan desk of a library.

8. P-A     Lay individual tiles for a shower [illegible].
           Lay concrete blocks for a building partition.


Construction of Experimental Forms. In
order to keep administration time within 
reasonable limits, four forms were constructed, 
each including 80 items covering two of the 
components. Within each form the com- 
ponents were paired with the intent to minimize 
similarity of item content and the items were 
arranged in a randomized fashion to reduce 
the effect of repetition of component and con- 
tinuum extreme. 

Administration and Scoring of Experimental 
Forms. By arrangements made through the 
offices of the Occupational Research Center, 
Purdue University, from 166 to 174 copies of 
each of the experimental forms were completed 
by job applicants from 20 companies. Each 
of the experimental forms was scored for both 
components in terms of the number of pref-
erences indicated for one of the extremes of
each component. For example, if a subject 
indicated a preference for indoor over outdoor 
job activities in 28 of the forty items, his score 
in that component would be 28. 
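
The scoring rule just described can be illustrated with a minimal sketch. The function name, the response coding ("indoor" and "outdoor"), and the data below are hypothetical and serve only to mirror the example in the text; they are not taken from the Survey materials.

```python
def component_score(responses, keyed_extreme):
    """Count the items on which the subject preferred the keyed extreme."""
    return sum(1 for choice in responses if choice == keyed_extreme)


# Hypothetical subject: 28 indoor and 12 outdoor choices across the forty items.
responses = ["indoor"] * 28 + ["outdoor"] * 12
print(component_score(responses, "indoor"))  # 28
```

Because each item forces a choice between the two extremes, the score for the opposite extreme is simply forty minus this score.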

Item Analysis and Revision of Experimental 
Forms. Based upon upper-lower half splits, 
a measure of internal consistency was determined
for each item in a component as well as
a measure of relationship of response for each
item of a given component with the total score
on the other component in the same form.
These relationships were quantified using D-Values (6).
The best 25 items of the original 40
in each component were selected for inclusion in
a single revised form, Form X. Those items
were retained which had the highest internal
consistency values and at the same time the lowest
relationship with the other component in the
first experimental form. No item was retained
which had an internal consistency D-Value of
less than .45 or a D-Value based on the “other”
criterion which was larger than its internal
consistency D-Value. In order to reduce the
effect of item position upon subject response in
Form X, the items were arranged in a random
manner. The items were also arranged so that
either extreme of the components was listed
first an approximately equal number of times.
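
The two retention rules stated above (an internal consistency D-Value of at least .45, and an "other" criterion D-Value no larger than the internal one) can be sketched as a simple filter. The dictionary keys, the item labels, and the D-Values are hypothetical, and the tie-breaking order used to rank eligible items is an assumption; the text does not say how the two criteria were weighed against each other.

```python
def retain_items(items, n_keep=25, min_internal=0.45):
    """Apply the two exclusion rules, then keep the n_keep best eligible items."""
    eligible = [
        item for item in items
        if item["d_internal"] >= min_internal
        and item["d_other"] <= item["d_internal"]
    ]
    # Assumed ranking: highest internal consistency first, then lowest
    # relationship with the other component.
    eligible.sort(key=lambda item: (-item["d_internal"], item["d_other"]))
    return eligible[:n_keep]


# Hypothetical D-Values for four items of one component.
items = [
    {"name": "a", "d_internal": 0.60, "d_other": 0.10},
    {"name": "b", "d_internal": 0.40, "d_other": 0.05},  # internal value below .45
    {"name": "c", "d_internal": 0.50, "d_other": 0.55},  # "other" exceeds internal
    {"name": "d", "d_internal": 0.70, "d_other": 0.30},
]
print([item["name"] for item in retain_items(items, n_keep=2)])  # ['d', 'a']
```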

Administration and Scoring of Form X. A
total of 260 usable completed copies of the
Survey were returned by twelve additional
companies which had agreed to participate in
the project.




The completed Survey forms were again
scored in terms of the number of preferences
indicated for one of the extremes of each component.
Thus eight scores were secured for each
subject.

Double Cross-Validation Design for Further 
Revision. The 260 completed copies of Form 
X were divided between two equal groups, A 
and B, in order to permit use of the double 
cross-validation method as described by Katzell 
(3). This design provides for scoring of Group 
A responses using a key based upon Group B 
responses and vice versa. A final key is developed
using the item analysis results from both
samples.

The basic principle underlying the double
cross-validation method is the enhancing of
the reliability of findings through the replication
of experiments. Greater confidence can
be placed in congruent results from two or
more independent samples than in the results
of a single experiment, where the total N is
the same in the two experimental designs.

When applying the method to an internal
consistency analysis, two estimates of reliability
are secured, each based on a larger sample
than probably would have been available using 
the conventional method. These two reliabil- 
ity estimates are probably underestimates of 
the reliability of the final key because if the 
same number of items are included in all three 
keys, those in the final key will be more stable 
than those in the two sample keys. 
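
The design can be sketched in a few lines. This is only an illustration under stated assumptions: responses are coded 1 when the keyed extreme is chosen and 0 otherwise, and a plain difference in endorsement rate between the upper and lower halves stands in for the omega values used in the study. All function names and data are hypothetical.

```python
def upper_lower_difference(responses, item):
    """Endorsement rate of `item` in the upper half minus that in the lower half,
    with halves formed on total score."""
    ranked = sorted(responses, key=sum)
    half = len(ranked) // 2
    lower, upper = ranked[:half], ranked[-half:]

    def rate(group):
        return sum(person[item] for person in group) / len(group)

    return rate(upper) - rate(lower)


def build_key(responses, n_keep):
    """Keep the n_keep items that best separate the upper and lower halves."""
    n_items = len(responses[0])
    order = sorted(range(n_items),
                   key=lambda i: upper_lower_difference(responses, i),
                   reverse=True)
    return sorted(order[:n_keep])


def score(person, key):
    """Number of keyed items on which the keyed extreme was chosen."""
    return sum(person[i] for i in key)


def double_cross_validate(group_a, group_b, n_keep):
    """Score each group with the key derived from the other group."""
    key_a = build_key(group_a, n_keep)
    key_b = build_key(group_b, n_keep)
    scores_a = [score(person, key_b) for person in group_a]
    scores_b = [score(person, key_a) for person in group_b]
    return key_a, key_b, scores_a, scores_b


# Hypothetical three-item responses for two small groups.
group_a = [[1, 1, 0], [1, 1, 1], [0, 0, 1], [0, 0, 0]]
group_b = [[1, 0, 1], [1, 1, 1], [0, 1, 0], [0, 0, 0]]
print(double_cross_validate(group_a, group_b, n_keep=2))
```

Scoring each group with the key from the other group keeps the sample used to choose items separate from the sample used to estimate reliability.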

Item Analysis of Form X. Item counts were 
completed separately for Groups A and B. 
Based upon upper-lower half splits, a measure 
of internal consistency was determined for each 
component as well as a measure of relationship 
of response for each item of a given component 
with the total score on the other seven components.
These relationships were quantified
into omega values (7) which in turn are
translated into probability values, i.e., the
confidence level at which the differences between
upper and lower halves may be said to
be different from zero.
Development of Scoring Keys. Using the item
analysis results from Group A, the best 20 items
from the total of 25 in each component were
retained to make up a key to be used to re-score
the forms from Group B and vice versa. The
decision to retain 20 items was arbitrary, being


a compromise considering reliability and testing 
time. Items were retained so as to maximize 
internal consistency while at the same time 
minimizing relationships with other compo- 
nents. From the data for Group A it was 
possible to select 146 of the 160 items which 
discriminated between the upper and lower 
halves at less than the one per cent confidence 
level on an internal consistency basis. The 
remaining items discriminated at less than the 
12 per cent level. From the data for Group B 
it was possible to select 155 of the 160 items 
which discriminated at the one per cent level, 
The remaining items discriminated at less than 
the 12 per cent level. 

Similarity of Groups and Keys. None of the 
differences between the means and standard 
deviations for the two samples is significant 
at the five per cent level of confidence. Thus, 
it can be said that neither the two forms nor the 
two sample groups differ appreciably, 

Intercorrelation of Scales. The intercorrela- 
tions between the scales for Groups A and B 
are presented in Table 1. The development of 
Scale 9 listed in the table will be discussed 
subsequently. The values given are product- 
moment coefficients, which were considered to 
be more suitable than tetrachoric coefficients
in view of the small size of the samples (2). 
Although the size of the correlations varies 
somewhat between samples, the general rela- 
tionships are consistent. 

The intercorrelations among the Routine— 
Creative, Repetitive—Varied, and Responsi- 
ble—Non-responsible scales indicated that 
revision would be desirable. Accordingly, a 
composite scale was constructed using items 
from the Routine—Creative and Repetitive— 
Varied groups. Items were selected in an 
attempt to make the relationship with the 
Responsible—Non-responsible scale as small 
as possible. Using the composite scale keys, 
the Survey forms were scored in the same man- 
ner as previously. From Table 1 it can be 
seen that the composite scale, identified as 
Routine—Varied, is also substantially corre- 
lated with the Responsible—Non-responsible 
scale. It would appear that the responsible 
and non-responsible activities described in the 
items differed substantially in terms of the 
variety of activities involved as well as in terms 
of degree of responsibility. It therefore seemed
advisable to drop the Responsible—Non-responsible
scale and to retain the composite Routine—Varied
scale, leaving six scales in the
Survey. Only two of the intercorrelations for
both groups among the six scales retained are
as high as .50. Most of the remainder are
much lower.

Table 1
Reliabilities and Intercorrelations of Scales in Form X*

                                             Scale
Scale                       Group     1     2     3     4     5     6     7     8     9
1. Precise—Approximate        A      81   -50    16    32   -05    23   [?]    30   -37
                              B      86   -50    16    35   -05    38    39   -46   -44
2. Routine—Creative           A           8[?] [?]19  -11    37   -67   -27    67
                              B            85   -50   -17    15   -83   [?]    65
3. Hazardous—Non-hazardous    A                  78    00   -07    28    01   -16   -17
                              B                  82   -03   -04    48    14   [?]   -47
4. Sedentary—Bodily Active    A                        66    20    02    50    15   -03
                              B                      6[?]    39    00    52    06   [?]
5. Isolative—Gregarious       A                              70   -46    18    40    40
                              B                              79   -27    19    23    16
6. Responsible—Non-responsible A                                   79    12   -39   -70
                              B                                    81    07   -64   -11
7. Indoor—Outdoor             A                                          73   -04   -16
                              B                                          74   -06   -05
8. Repetitive—Varied          A                                                78
                              B                                                76
9. Routine—Varied             A                                                      82
                              B                                                      80

* Decimals omitted. Diagonal entries are reliabilities. Negative coefficients indicate positive relationships with reverse extreme of scale continuum.

Reliability of Scores. The reliability estimates
for the keys used to score Groups A
and B are given in Table 1. These estimates
were determined using the Kuder-Richardson
Case III, Equation 20 (5). It is to be remembered
that in the double cross-validation design
these obtained estimates probably are underestimates
of the reliabilities of the final keys
and that the larger of the two is likely to be
the more accurate estimate. Under these assumptions,
three of the scales would have
reliability coefficients which are larger
than .80, two well above .70, and one nearly .70.
It is believed that these reliabilities can be
considered to be satisfactory inasmuch as they
are based upon only 20 items. It is also believed
that test-retest reliability coefficients
would be higher than the internal consistency
ones here reported.
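
For reference, Kuder-Richardson Formula 20 for k dichotomously scored items is r = [k / (k - 1)] [1 - (sum of p_i q_i) / (variance of total scores)], where p_i is the proportion choosing the keyed extreme on item i and q_i = 1 - p_i. The sketch below is a minimal illustration with hypothetical data; it uses the variance computed with N in the denominator, and the function name is an assumption.

```python
def kr20(responses):
    """Kuder-Richardson Formula 20 for a list of 0/1 response vectors."""
    n_people = len(responses)
    n_items = len(responses[0])

    totals = [sum(person) for person in responses]
    mean_total = sum(totals) / n_people
    variance = sum((t - mean_total) ** 2 for t in totals) / n_people

    sum_pq = 0.0
    for i in range(n_items):
        p = sum(person[i] for person in responses) / n_people
        sum_pq += p * (1.0 - p)

    return (n_items / (n_items - 1)) * (1.0 - sum_pq / variance)


# Hypothetical responses of four subjects to three items (1 = keyed extreme).
data = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(data))  # 0.75
```

The formula is undefined when every subject obtains the same total score, since the variance in the denominator is then zero.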

Construction of Composite Key. The selection
of items to be included in the final key
using a double cross-validation design is based
upon consideration of the discriminative power
of the items in both sample groups. In this
study the significance of the differences between
preferences for statements in a given scale made
by the upper and lower halves divided on the
basis of total score was translated into probability
values, i.e., the confidence level at which
the difference may be said to be different from
zero. The composite probability value for each
item was determined by calculating the product
of the two probability values.³ In the selection
of items, relationship to other scales was considered
as well as internal consistency. The
five least satisfactory items in each scale were
designated for elimination leaving 20 items in
each scale. No item was retained that had
a composite internal consistency probability
value greater than .0004.

³ Lindquist (8) suggests that the use of chi-square is a more appropriate means for determining composite probability. However, since the same rank order of items is secured by either method, the less complex one was used. This method results in the selection of the same items, but according to Lindquist yields inflated probability estimates.
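
The relation between the two ways of combining probabilities can be sketched numerically. The sketch assumes that the chi-square procedure mentioned in the footnote is the familiar one in which -2 times the sum of the natural logarithms of k independent probabilities is referred to chi-square with 2k degrees of freedom; the source does not spell the procedure out. For two probabilities with product q, that procedure reduces to the closed form q(1 - ln q). Function names and item values are hypothetical.

```python
import math


def product_probability(p_a, p_b):
    """Composite value used in the study: the simple product."""
    return p_a * p_b


def chi_square_combined_probability(p_a, p_b):
    """Combined probability from -2 ln(p_a * p_b) referred to chi-square, 4 df.

    The upper tail of chi-square with 4 df at x is exp(-x / 2) * (1 + x / 2),
    which for x = -2 ln(q) equals q * (1 - ln(q)).
    """
    q = p_a * p_b
    return q * (1.0 - math.log(q))


# Hypothetical pairs of Group A and Group B probability values for three items.
items = {"item 1": (0.02, 0.02), "item 2": (0.01, 0.001), "item 3": (0.05, 0.10)}

by_product = sorted(items, key=lambda k: product_probability(*items[k]))
by_chi_square = sorted(items, key=lambda k: chi_square_combined_probability(*items[k]))
print(by_product == by_chi_square)  # True: both methods rank the items alike
```

Under this assumption the product is always the smaller of the two values, so it ranks items identically while overstating their significance, which agrees with the remark attributed to Lindquist.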
Considerations in Possible Application of the
Job Preference Survey. With an instrument
such as this Survey, interest profiles can be
contrasted, with allowance for differences
in the means, as cannot be done with the
[illegible] preference type of inventory wherein an
expressed interest for one activity usually
demands an omitted preference for a different
type of activity, thus exaggerating profile differences.
Furthermore, in the Survey the preferences
required are within one category, not
between categories.

It is not essential to have population norms
[illegible] in order to use the Survey
for selection or placement purposes since the relative
strength of interests is meaningful. Thus,
individual profiles can be compared to synthetic
standards determined by job analysis designed
for [illegible]. It is true, however, that an
empirical profile of successful workers (in the
broadest sense of the term) for each job would
be most useful. Both the synthetic and empirical
profiles probably would be most meaningful
in terms of the range of scores made by
a given proportion of the population, viz., the
middle 50 per cent.
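
A comparison of an individual profile with the middle 50 per cent range of an empirical group might be sketched as follows. The scale names come from the Survey, but the function names, the scores, and the choice of Python's "inclusive" quartile method are illustrative assumptions only.

```python
import statistics


def middle_half_range(scores):
    """Return the first and third quartiles of a group's scores on one scale."""
    q1, _median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return q1, q3


def compare_profile(individual, group_scores_by_scale):
    """Label each scale score as below, within, or above the group's middle half."""
    result = {}
    for scale, value in individual.items():
        low, high = middle_half_range(group_scores_by_scale[scale])
        if value < low:
            result[scale] = "below"
        elif value > high:
            result[scale] = "above"
        else:
            result[scale] = "within"
    return result


# Hypothetical scores of successful workers on two 20-item scales.
group = {
    "Indoor-Outdoor": list(range(0, 21)),
    "Sedentary-Bodily Active": list(range(0, 21)),
}
applicant = {"Indoor-Outdoor": 12, "Sedentary-Bodily Active": 18}
print(compare_profile(applicant, group))
```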




Summary 


The importance of interests to job success
and satisfaction has been recognized for some
time. However, probably none of the large
number of interest assessment devices which are
currently available are entirely satisfactory
for use in business and industry as an aid
in the selection and placement of unskilled and
semi-skilled workers. This study was undertaken
to develop a Job Preference Survey to meet this
need.

The final form of the Survey includes 120
paired-statement items, descriptive of six
essentially independent components of work
interest, which are: 1. Routine—Varied; 2.
Indoor—Outdoor; 3. Hazardous—Non-hazardous;
4. Sedentary—Bodily Active; 5. Isolative—Gregarious;
and 6. Precise—Approximate.

Only one of the intercorrelations among the
six scales is as high as .50 while most of the 
others are much lower. Three of the scales 
were found to have internal consistency relia- 
bility coefficients in excess of .80, two well 
above .70, and one nearly .70. These coeffi- 
cients are conservative estimates of the internal 
consistency of the scales and it is believed that 
test-retest reliability coefficients will be found 
to be appreciably higher. 

It was suggested that an instrument such 
as the Survey can be used profitably in job 
placement with reference to comparison of 
individual profiles with an empirical standard, 
or to comparison of profiles with synthetic 
standards based on job demands determined 
by job analysis. These comparisons would 
probably be most meaningful in terms of the 
range of scores made by a given proportion of 
the population, viz., the middle 50 per cent. 


Received October 5, 1951. 


References 


1. Fryer, D. Measurement of interests. New York:
Henry Holt, 1931.
2. Guilford, J. P. Fundamental statistics in psychology
and education. New York: McGraw-Hill Book
Co., 1942, 244.
3. Katzell, R. A. Cross-validation of item analyses.
Educ. psychol. Measmt, 1951, 11, 16-22.
4. Kelley, T. L. Report on an activity preference
test for the classification of service personnel.
OSRD Report No. 4484. Cambridge: Harvard
Univ., 1944.
5. Kuder, G. F., and Richardson, M. W. The theory
of the estimation of test reliability. Psychometrika,
1937, 2, 151-160.
6. Lawshe, C. H. A nomograph for estimating the
validity of test items. J. appl. Psychol., 1942,
26, 846-849.
7. Lawshe, C. H., and Baker, P. C. Three aids in
the evaluation of the significance of the difference
between percentages. Educ. psychol.
Measmt, 1950, 10, 263-270.
8. Lindquist, E. F. Statistical analysis in educational
research. Boston: Houghton Mifflin Co., 1940,
46.
9. Strong, E. K., Jr. Vocational interests of men and
women. Palo Alto: Stanford Univ. Press, 1943.
10. Super, D. E. Appraising vocational fitness. New
York: Harper & Bros., 1949.
11. Vernon, P. E. Classifying high-grade occupational
interests. J. abnorm. soc. Psychol., 1949, 44,
85-96.


Development of a Home Economics Interest Inventory¹


Hildegarde Johnson
Iowa State College


The study reported here is the first of a
series planned to develop a special vocational
interest inventory in the field of home
economics.

Such a special vocational interest inventory
differs from the general inventories which are
widely used in the guidance of students in that
it is designed to differentiate among occupations
in a limited field. General vocational
interest inventories sample a wide variety of
interests in an attempt to include some interests
of persons in every occupation as an aid
to classifying a student’s interests into broad
types, such as persuasive or mechanical, or to
identifying a student’s interests with those of
one of a large number of specific occupations.
In contrast, the items in a special vocational
interest inventory are a sample of the interests
which bear upon occupations within one field
and the inventory may be used to compare the
vocational interests of a student who has
selected this field with the interest patterns
of the specialists in it.

The need for a special inventory in the field
of home economics has been recognized for
some time. Attempts which have been made
to develop keys for occupations in home
economics on the Strong and Kuder tests
have resulted in differentiation of some occupational
groups from women-in-general. However,
the percentage of overlapping between
the distributions is so great that the usefulness
of the keys for individual guidance may be
questioned.

Basic Assumptions


The first assumption made in this study is
that interests of persons employed in a given
occupation form a pattern. This assumption
may be stated as a theory which is consistent
with research in the area of interests: if a
number of interests and aversions which are
characteristic of persons in a specific occupational
group can be identified and their average
intensity determined, it is possible to delineate
a pattern in which the component parts are
the interests and aversions and the average
degree of intensity represents the dimensions
of each part. An occupational interest pattern
may be defined, then, as a configuration of
intensities of interests and of aversions which
are common to persons in an occupational
group but not common to persons-in-general.
The ease of determining such a pattern for an
occupational group depends upon the homogeneity
of persons in the group and the extent
to which the true interests and aversions which
belong in the pattern occur in the sample of
interests which are included in the inventory.

¹ This is largely a summary of [illegible] the Iowa State College Library. The writer expresses sincere appreciation to Dr. Hester Chadderdon who directed the study.

The second assumption is that if a student 
has an interest pattern which is similar to the 
interest pattern of those in an occupational 
group, he will find greater satisfaction in that 
occupation than he would find in one in which 
the persons employed have a different interest 
pattern. It is reasonable to suppose that the 
average person in an occupational group is 
achieving vocational satisfaction. This satis- 
faction probably results from a reaction of 
interest rather than of aversion or indifference 
to the activities which are performed on the 
job and to the psychological stimuli which are 
concomitant with the job. If this is true, 
another person who has the same interests 
may be expected to respond in the same way 
to the same activities and psychological stimuli 
and he would, therefore, find satisfaction in the 
same job. Research indicates that vocational 
interests change relatively little during the 
college years and after graduation; hence, it 
is reasonable to suppose that if a college stu- 
dent has the same interests as professional 
persons in a specific occupation, he would 
respond in the same way, not only at the time 
his interests are measured as a freshman in
college, but a few years later when he is ready
to accept a position. Assuming suitable ability,
a student will find vocational satisfaction
if he selects the occupation for which the
interest pattern is most like his own.




Development of the Trial Form 


Items in a special vocational interest inven- 
tory should be a sample of the interests and 
aversions which are components of the true in- 
terest pattern of occupations within one field. 
These interests have not been identified and 
cannot, therefore, be sampled by the usual 
methods of sampling. However, they are inter- 
ests and aversions associated with a finite num- 
ber of jobs within a vocational field. Therefore, 
the assumption seems reasonable that an ade- 
quate analysis of occupations in the field would 
identify many of these interests and would 
therefore be the best basis for the selection
of items. Analyses of the occupations in home
economics were made by reading vocational
guidance literature and by interviewing 26
persons who had had vocational experiences.
Additional items for the trial form of the home
economics interest inventory were secured from
data collected in a previous study of interests
of home economics students² and from general
vocational interest inventories in current use.

The 448 items in the trial form of the inventory
were grouped into three sections. The
first group of items were activities to which
reactions were to be indicated on this five-point
scale: (1) like very much (2) like (3) indifferent,
or do not know (4) dislike (5) greatly dislike.
An example of an activity in this section is
item 51, “supervise preparation and serving
of food.”

The second group of items was job characteristics
and environmental factors to which
reactions were also to be indicated on a five-point
scale: (1) highly desirable characteristic
of a job (2) desirable (3) not important (4)
undesirable (5) highly undesirable. An example
in this section is item 345, “work in
which it is possible to see definite change in
people.”

The third section was made up of miscellaneous
items combined into series to be ranked
in order of preference. Items 359 to 362, for
example, are: “work with preschool children,”
“work with grade school children,” “work with
adolescents,” and “work with adults.”

² Nelson, Esther. A device to determine job characteristics influencing college home economics students in their choice of a major. Unpublished M.S. thesis. Ames, Iowa: Iowa State College Library, 1947.


Collection of Data 


Fourteen home economics occupations were 
selected for study, criterion groups being made 
up of a sample or a census of persons employed 
in each of these occupations. To secure these 
groups a total of 1,884 inventories was mailed. 
From the 1,799 inventories which presumably
reached their destination, 1,175 or 65 per cent 
responses were received. Since the returns for 
four of the occupational groups proved to be 
too few for satisfactory study, the analysis was 
limited to ten occupational groups. To bal- 
ance the number of returns in the ten groups 
and facilitate the computations, one hundred 
returns were selected at random from each of 
eight groups and all of the inventories were used
in the two remaining groups. 

Criterion groups are described briefly in 
Table 1 in which are recorded the number of 
persons, their median age and the median years 
of experience they had had in the occupations 
in which they were employed at the time they 
responded to the inventory. 


Analysis of Data 


Responses to the trial form of the inventory 
of each of the ten groups were tabulated 
separately, using an IBM test scoring machine.
Chi-square was the statistic selected to deter- 
mine which items differentiated among occupa- 
tional groups. When this statistic is computed 
it is not necessary to assume that responses 
are normally distributed or that the responses 
used form a true scale. Furthermore, chi-
square was equally appropriate for items in the
first two sections in which responses were 
indicated on a five-point scale and for items in 
the last section which were ranked in order of 
preference. 

A technique for analyzing the data was 
desired whereby not only the items which 
differentiate among occupational groups would 
be identified but by which items could be 
ranked for each occupational group according 
to the extent to which they differentiate a
given group from other home economists. The
contribution of each occupational group to the 
chi-square value for each item was used as this 
basis for ranking items and to determine for 
which occupational groups the item was dis- 
criminating. The distribution of responses 
which was originally indicated on a five-point 
scale in the first two sections of the inventory 
was broken into two groups to facilitate 
computation of the components of chi-square. 
Items in which the distribution of the combined 
samples was skewed toward the unfavorable 
responses were divided by combining the neu- 
tral responses with the unfavorable responses. 
Items skewed toward the favorable responses 
were divided by combining the neutral re- 
sponses with the two favorable responses. 

The formula selected to compute chi-square 
and the individual contributions of each occu- 
pational group to the chi-square value is one 
that has not been commonly employed. It is: 


$$\chi^2 = \sum_{i=1}^{n} \frac{(a_i N_2 - a'_i N_1)^2}{N_1 N_2 (a_i + a'_i)},$$

$n$ = total number of occupational groups,

$i$ = occupational groups,

$a_i$ = number of responses of “like” for each
occupational group,

$a'_i$ = number of responses of “dislike” for each
occupational group,

$N_1$ = sum of responses of “like,”

$N_2$ = sum of responses of “dislike.”

For each item there are ten component parts 
of chi-square, one for each occupational group. 
Table 2 is an example of the results of analyses 
of items. 
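
As a check on the formula, the components can be recomputed from the upper-level and lower-level frequencies reported for Item 32 in Table 2. The sketch below is illustrative only; the function and variable names are not from the original study, and the upper level is treated as "like" and the lower level as "dislike."

```python
def chi_square_components(upper, lower):
    """Return each group's contribution to chi-square for one item.

    upper[i] and lower[i] are the "like" and "dislike" frequencies
    (a_i and a'_i) of occupational group i.
    """
    n1 = sum(upper)  # N_1, total "like" responses
    n2 = sum(lower)  # N_2, total "dislike" responses
    return [
        (a * n2 - b * n1) ** 2 / (n1 * n2 * (a + b))
        for a, b in zip(upper, lower)
    ]


# Frequencies for Item 32, in the row order of Table 2.
upper = [35, 44, 28, 44, 47, 24, 33, 64, 51, 15]
lower = [65, 56, 72, 56, 53, 76, 67, 30, 49, 54]

components = chi_square_components(upper, lower)
chi_square = sum(components)
largest_group = components.index(max(components))  # position in Table 2
```

The group with the largest component is the one the item separates most sharply from other home economists, which is the basis on which items were ranked for the scoring keys.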




Ninety-two per cent of the 448 items in the 
inventory were significant at or beyond the 
5 per cent level and 89 per cent were significant 
at or beyond the 1 per cent level. Therefore, 
most of the items included in the trial form 
are of value in differentiating among these ten 
occupational groups. This would seem to indi- 
cate that the techniques used to find items 
were highly satisfactory. 

Items were selected for ten scoring keys by 
determining which items most successfully 
differentiate each occupational group from 
other home economists, the size of the numerical 
contributions of this group to chi-square values 
forming the most important basis for selection. 
An attempt was also made to balance the num- 
ber of items in the various keys and to balance 
the number of items in each to which persons 
responded more favorably and to which they
responded less favorably than did other home 
economists. 

In the first study weights of 4, 3, 2, 1, and 0 
were tentatively assigned to the levels of re- 
sponse to items selected for the scoring keys. 
Further experimentation with methods of 
weighting is planned after which weights will 
be used which result in successful differentia- 
tion among groups with reasonable expenditure 
of time in scoring. 

The Home Economics Interest Inventory and 
Strong’s Vocational Interest Blank for Women 
were compared with respect to the degree of 
differentiation achieved between home econom- 
ics teachers and a base group. The distribu- 
tions of scores on the Home Economics Teacher 
key of the Home Economics Interest Inventory 


Table 1


Number of Persons, Median Age, and Median Years of Experience of Criterion Groups


                                       No. of    Median    Median Years
Occupation                             Persons   Age       Experience
Extension home economist                 100      35.0         5.0
Food product promotion                   100      36.2         4.8
Food service director                    100      36.2         5.7
Group work with young children           100      28.2         4.0
Home economist in social welfare         100      37.0         5.2
Home service director                    100      31.3         4.2
Hospital dietitian                       100      38.5        12.5
Journalist or radio home economist        94      35.2         5.3
Restaurant and tea room manager           69      33.5         7.5
Teacher (high school)                    100      39.8         5.3




Table 2


Chi-square Value and Components of Chi-square for Item 32


                                          Frequencies
                                        Upper    Lower    Components of
Occupational Groups                     Level    Level    Chi-square

Extension home economist                  35       65        1.0332
Food product promotion                    44       56         .6737
Hospital dietitian                        28       72        5.9803
Teacher                                   44       56         .6737
Home service director                     47       53        2.0542
Food service director                     24       76       10.6408
Group work with young children            33       67        2.0299
Journalist or radio home economist        64       30       30.9447
Home economist in social welfare          51       49        5.0616
Restaurant and tea room manager           15       54        9.5668

Total                                    385      578       68.6589


of 100 home economics teachers and 100 home 
economists-in-general were estimated to over-
lap³ 23 per cent, and no women in the home
economist-in-general group score equal to or 
higher than the mean of home economics 
teachers. 

Strong⁴ estimated that the distributions of
scores of home economics teachers and women- 
in-general overlap 32 per cent and that 2.3 per 
cent of the women-in-general score equal to or 
higher than the mean of home economics teach- 
ers. An estimate of the percentage of overlap 
of scores on the Strong key between distribu- 
tions of home economics teachers and home 
economists-in-general, rather than women-in- 
general, would be considerably larger than 32 
per cent.⁵


³ Percentage of overlapping was determined by ex-
pressing the mean difference in standard deviation units
and referring to Table 1 in J. W. Tilton. The measure-
ment of overlapping. J. educ. Psychol., 1937, 28, 658.

⁴ Strong, E. K. Information on the mean scores of
women-in-general on the home economics teacher key
and overlapping between home economics teachers and
women-in-general. (Private communication, June 1,
0.)

⁵ Strong, E. K. Vocational interests of men and
women. Stanford University, California: Stanford Uni-
versity Press, 1943. Chapter 21.


Summary


Items for a special vocational interest in-
ventory in the field of home economics were
selected by analyzing occupations in the field,
reading vocational guidance literature, inter-
viewing persons who had had vocational
experience, analyzing data collected in a pre-
vious study, and studying general vocational
interest inventories in current use.

The tentative form of the Home Economics 
Interest Inventory was mailed to 1,884 profes- 
sional home economists and returned by 1,175 
of them. Responses to individual items by 
persons in ten occupational groups were ana- 
lyzed using a chi-square technique, to determine 
which items differentiated among occupational 
groups. Ninety-two per cent of the 448 items 
were significant at or beyond the 5 per cent 
level, a percentage which indicates that the 
technique used to select items was highly 
satisfactory. Items were selected for ten scor- 
ing keys and weights of 4, 3, 2, 1, and 0 were 
tentatively assigned to the levels of response 
to these items. 

Other investigations in the series of studies 
of which the present one is a part might well 
include: the development of scoring keys for 
home economics occupations not yet studied, 
the development of norms for each occupational 
group based on the responses of another
sample of professional home economists, a com-
parison of the percentage of overlap when
various methods of weighting items are used,
and a longitudinal study of the interests of 
students exploring the possibility of using their 
cls as the basis for developing scoring 
keys. 


Received October 1, 1951. 


Effect of Changing the Number of Item Responses from 
Five to Four in the Same Test 


Ardis Swordes 


State Civil Service Commission, Springfield, Illinois 


During a routine analysis of our Clerical 
Series of 1948 an interesting situation was 
noticed. An analysis had been run on the 
Clerical Series using three groups of 100 papers 
each—those with highest scores, lowest passing 
scores, and lowest scores. 

Answers on all distracters were recorded. 
Most of the test items had five distracters, but 
two sections had four, items 91-100 on Mathe- 
matics, and items 101-110 on Following Written 
Instructions. The results ought to have shown 
a zero response in the fifth space for these 
sections of the test, but instead a surprising 
number of responses appeared in the fifth space. 
These were found to be the result of marks that 
the applicants had put into the fifth space on 
the answer sheet. 

A total count of such errors on these papers 
was then made as shown in Table 1. The total 
of all such marks was 160. 

Table 1


Total of Marks in the Fifth Space for Items with Four
Responses in the Clerical Series Analysis
of May 1948


                 Clerk    Clerk    Clerk    Clerk
                 ...      ...      ...      Steno.    Totals

Items 91-100
High Pass   7   4   0   4   15
Low Pass   17   ...   47
Failed   ...
Totals   ...

Items 101-110
High Pass   ...   1   ...   4   9
Low Pass   ...
Failed   ...
Totals   22   25   47


Since there were too many such marking
errors to be dismissed as negligible, or the
result of chance alone, an attempt was made
to account for them. While the candidates
were answering the first 90 questions, they
perhaps developed a “set” to make the mark
for the last answer in the fifth space on the
answer sheet. When the group of questions
with four answers was reached, the same “set”
continued and the last answer, now the fourth
choice, was still placed in the fifth space. In
the same way the mark for the next to the
last answer was placed in the fourth space
occasionally, although it should have gone into
the third. As might be expected, more wrong
marks resulted when the correct answer was
the fourth than when any of the others was
correct. A tabulation of the average number
of errors for each correct distracter showed
the results given in Table 2.


Table 2


Average Number of Marks in the Fifth Space for
Each Correct Response to Questions 91-110


Correct       Average Number of Marks
Answer        in the Fifth Space
(1)                  2.0
(2)                 11.5
(3)                  7.0
(4)                 15.5


These results necessarily include the effect
of guessing at the right answer and selecting
the fourth distracter when it was wrong, and
similar errors.

There were about two and one-half times as
many total errors in the Low Pass group as in
the High Pass group, and more than three
times as many in the Failed group as in the
High Pass group for Items 91-100. The High
Pass group presumably was the most alert and
intelligent and consequently made the fewest
errors. The Failed group had the least ability
to adapt to the change. This seems to penalize
the low groups unnecessarily for a mechanical
error. Even one point may be the difference
between passing and failing.

In items 91-100 the Clerk Typist I group
made the most mistakes. They have the low-
est qualifications in experience and training,
and should be protected from situations leading
to such errors, unless choosing the correct
position is considered an ability to be measured
by the test.

In view of the foregoing conclusions the
following steps have been taken to reduce the
undesirable results of using four and five
responses in the same examination. As far as
possible the same number of choices is used
throughout the test. Special directions are
given when it seems necessary to have a group
of items with a different number of distracters.
Questions with four answers are grouped to-
gether, instead of being alternated with items
with five choices. Such precautions are espe-
cially desirable when a written test is being
prepared for a position for which the qualifica-
tions required are comparatively low.


Summary 


Items with four choices and items with five 
choices were commonly used in the same civil 
service examination. It was discovered that 
a good many applicants were placing a mark 
in the fifth space on the answer sheet when 
they answered the items with four response 
spaces. 

A count of the total number of such marks 
in a written test showed a total that could 
hardly have occurred by chance. Technicians 
concluded that certain precautions should be 
taken to reduce the undesirable results of 
using a different number of distracters in the 
same examination. These included special in- 
structions, reduction of the number of alternate 
groups, and restriction of a varying number of 
choices to the better qualified groups, when 
practicable. 


Received September 28, 1951.


Some Research Findings with the Wonderlic Personnel Test 


Edward N. Hay 
Edward N. Hay & Associates, Inc., Philadelphia, Pa. 


During 1945 each of 400 young women 
applicants for clerical positions in a large 
organization was given two forms of the 
Wonderlic Personnel Test, Forms D and F. 
This was part of a project to develop a one-
minute “warm-up” test, known as Test 1.¹
In the course of this work it became apparent 
that there was a consistent difference in diffi- 
culty between these two forms of the Personnel 
Test. 

In order to insure random sampling the four 
test sequences listed below were given in 
rotation. That is, every fourth applicant took 
the tests in order number 1, etc., as follows: 


       N      Order of Administration of Tests
1.    100     Test 1, Personnel F, Personnel D
2.    100     Test 1, Personnel D, Personnel F
3.    100     Personnel F, Personnel D
4.    100     Personnel D, Personnel F


Table 1 shows the means and sigmas for
each group of 100 cases for Form D and Form 


1 Hay, E. N. A warm-up test. Personnel Psychol., 
1950, 3, 221-223. 


Table 1

Difference in Difficulty of Forms D and F

Order     M_D      Sigma_D    M_F      Sigma_F    M_D - M_F
1,FD      24.03     6.38      23.54     7.04       + .49
1,DF      22.13     5.52      25.26     5.76       -3.13
FD        22.95     5.70      22.39     5.80       + .56
DF        21.61     5.70      24.95     6.21       -3.34
Aver.     22.68               24.04                -1.36


Table 2

Practice Effects

Order     M_1st     M_2nd     M_1st - M_2nd
1,FD      23.54     24.03       - .49
1,DF      22.13     25.26       -3.13
FD        22.39     22.95       - .56
DF        21.61     24.95       -3.34
Aver.     22.42     24.30       -1.88


F. Since the rotating order in which the tests 
were given neutralizes the practice effect, the 
difference of 1.36 in the means is apparently 


Table 3

Statistical Summary of Changes in Score

Change in Score      f
      16             1
      15
      14
      13
      12             2
      11             1
      10             5
       9             6
       8             8
       7            19
       6            22
       5            20
       4            34
       3            48
       2            55
       1            37
       0            48
      -1            31
      -2            30
      -3            15
      -4             9
      -5             3
      -6             5
      -7
      -8
      -9             1
                   400

Changes in Score, 1st to 2nd Test

                 N        %
Increased       258     64.5%
Unchanged        48     12.0
Decreased        94     23.5
                400    100.0%

Average Score Changes

                 N      Points
Increased       258      3.92
Decreased        94      2.40
Net Increase             1.965




a difference in difficulty of the tests themselves. 
In other words, Form F is 1.36 score points 
easier on the average than Form D. This 
difference is more than three times the sigma 
of the difference of the means and is statistically 
significant at the 1% level. Unpublished fig-
ures gathered by the American Bankers Asso-
ciation show approximately the same one-point
difference, in over a thousand cases. 


Practice Effect 


There is a significant practice effect when 
one form of the Personnel Test is given immedi-
ately following the other. The average in-
crease in score of the second test over the
first test is the practice effect of the first test.
Any difference in difficulty of the two tests
was neutralized by giving Form D first in half
the cases and Form F in the other half. The 
results are shown in Table 2 which indicates 
that the average increase in score of the second 
test—the practice effect—is 1.88 raw score 
points.
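
The two averages can be recovered from the four group means alone, which makes the counterbalancing argument concrete. The following is a minimal sketch using the means in Table 1; the variable and function names are illustrative and not part of the original analysis.

```python
from statistics import mean

# Group means from Table 1: (Form D mean, Form F mean, form given first).
groups = [
    (24.03, 23.54, "F"),  # Test 1, Personnel F, Personnel D
    (22.13, 25.26, "D"),  # Test 1, Personnel D, Personnel F
    (22.95, 22.39, "F"),  # Personnel F, Personnel D
    (21.61, 24.95, "D"),  # Personnel D, Personnel F
]

# Difficulty: average D minus average F. Each form is given first in two
# groups and second in two groups, so practice cancels out of this figure.
form_difference = mean(d for d, f, _ in groups) - mean(f for d, f, _ in groups)


def gain(d, f, first):
    """Second-test mean minus first-test mean for one group."""
    return (f - d) if first == "D" else (d - f)


# Practice: average gain from first to second test. Each form is the
# second test in two groups, so difficulty cancels out of this figure.
practice_effect = mean(gain(d, f, first) for d, f, first in groups)
```

With these means the form difference comes to about -1.36 and the practice effect to 1.88, in agreement with Tables 1 and 2.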

To study the practice effect from another 
point of view, a distribution was made of the
changes in score between the first and second 
administrations of the Personnel Test. Ap- 




proximately to equalize the difference in 
difficulty of 1.36 points between Forms D and 
F, one point was added to the D scores in each 
case. The distribution of changes in score is
shown in Table 3. 
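
The summary figures in Table 3 follow directly from the frequency distribution. A short sketch, assuming the frequencies as printed in Table 3 (blank rows taken as zero), is given here; the names are illustrative.

```python
# Change in score (second test minus first) mapped to its frequency.
changes = {
    16: 1, 12: 2, 11: 1, 10: 5, 9: 6, 8: 8, 7: 19, 6: 22, 5: 20,
    4: 34, 3: 48, 2: 55, 1: 37, 0: 48,
    -1: 31, -2: 30, -3: 15, -4: 9, -5: 3, -6: 5, -9: 1,
}

n = sum(changes.values())
increased = sum(f for c, f in changes.items() if c > 0)
unchanged = changes[0]
decreased = sum(f for c, f in changes.items() if c < 0)

# Average size of the change among those who rose and those who fell.
avg_increase = sum(c * f for c, f in changes.items() if c > 0) / increased
avg_decrease = sum(-c * f for c, f in changes.items() if c < 0) / decreased

# Net change per person over all cases.
net_increase = sum(c * f for c, f in changes.items()) / n
```

The counts and the two averages reproduce the table. The net increase computed this way is about 1.96, marginally below the 1.965 printed; the small gap may reflect rounding in the original hand calculation.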


Practice Effect of Test 1 


When Test 1 is not given first, the average 
increase in score of the second Personnel Test 
over the first is 1.95 raw score points. When 
Test 1 is given first, this practice effect is 1.81 
raw score points. This difference of 0.14 score 
points is no doubt due to the practice effect 
of Test 1. In all cases the difference in diffi-
culty of Forms D and F is neutralized by
administering Form D first in one half of the 
cases, and Form F in the other half. 
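
These two figures can be reproduced from the group means in Table 2 by averaging the gains separately for the orders with and without Test 1. The sketch below uses only those published means; the variable names are illustrative.

```python
from statistics import mean

# Gain = second Personnel Test mean minus first, per order (Table 2).
gains_with_test1 = [24.03 - 23.54, 25.26 - 22.13]     # orders 1,FD and 1,DF
gains_without_test1 = [22.95 - 22.39, 24.95 - 21.61]  # orders FD and DF

practice_with_test1 = mean(gains_with_test1)
practice_without_test1 = mean(gains_without_test1)

# The part of the practice effect apparently absorbed by Test 1.
test1_effect = practice_without_test1 - practice_with_test1
```

Each average pools one F-first and one D-first group, so the difference in form difficulty cancels within it.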

The practice effect of Test 1 is not statis- 
tically significant, perhaps because it is too 
short. It requires 1 minute as compared with 
12 minutes for each of the Personnel Tests. 
Test 1 does, however, seem to serve a useful 
purpose in some cases by relieving nervousness 
and by acquainting the naive subject with the 
test situation. 


Received March 7, 1952. 
Early publication. 


Speed and Accuracy of Reading Arabic and Roman Numerals* 


Dallis K. Perry 


University of Minnesota


It is quite obvious that Arabic numerals are 
read much faster and more easily in our 
culture than are Roman numerals. However, 
there has apparently been no previous attempt 
to determine how much faster and more accu- 
rately the Arabic numerals can be read. The 
present study was designed to find out how 
much speed and accuracy of reading are lost 
by the use of Roman numerals. 


Procedure 


Three sets of numbers, selected from a 
random number table, were used. The first 
set contained numbers from one to nine, the 
second set contained numbers from 10 to 49, 
and the third set contained numbers from 50 to 
99. Each set was typed on bond paper in 
pica type, in Arabic and in Roman numerals. 
The numbers were separated by spaces equiva- 
lent to two digits or letters. 

The numbers were presented to subjects 
with the order of Arabic and Roman sets 
alternated in the following pattern:


                    Numbers
Subject      1-9     10-49     50-99
   1         AR       RA        AR
   2         RA       AR        RA
   3         AR       RA        AR
  etc.


The subjects were allowed one minute for
each set and were told to read aloud as fast
and as accurately as possible until told to stop.
The experimenter timed each reading and
marked errors and the total number read on a
check sheet.

The subjects were 30 senior and graduate
students at the University of Minnesota. No
attempt was made to obtain a random sample
of any particular group, but there is no reason
to believe that these subjects would have a
greater familiarity with either type of numeral
than any other group.

The mean number of each group of numbers
read per minute and the mean number of errors
per number read were calculated for both types
of numerals. The significance of the differ-
ences between the means was tested, and the
percentages of increase in speed and decrease in
errors for Arabic over Roman numerals were
calculated.


Results

Speed of Reading. Table 1 summarizes the
measurements of speed of reading the two
types of numerals. The differences in favor
of Arabic numerals were significant at the one
per cent level for all three sets of numbers.
The absolute difference in number of numbers
read per minute ranges from 61.4 for the
single digit set to 85.0 for the 50-99 set. These
differences represent 50.1, 137.5, and 349.4 per
cent more numbers read per minute in Arabic
than in Roman numerals.


* The writer expresses grateful acknowledgment to
Profs. Donald G. Paterson and Miles A. Tinker for
suggestion of the problem and for assistance in designing
the experiment.

Edwards, A. L. Experimental design in psycho-
logical research. New York: Rinehart and Company.


Table 1


Differences and Significances of Differences in Reading Speed* of Arabic and
Roman Numerals (N = 30)


              Arabic            Roman                          Significance
                                               Percentage      Level
Digits     Mean     S.D.     Mean     S.D.     Difference      P
1-9        183.9    31.2     122.5    16.2       50.1          .01
10-49      115.7    19.8      40.3     9.4      137.5          .01
50-99      109.4    18.7      24.4     8.2      349.4          .01


* Numbers per minute.


Table 2


Differences and Significances of Differences in Accuracy* of Reading Arabic and
Roman Numerals (N = 30)


              Arabic            Roman                          Significance
                                               Percentage      Level
Digits     Mean     S.D.     Mean     S.D.     Difference      P
1-9          .1       .3       .4       .9       75.0          .05
10-49        .3      ...      8.4     10.4       96.4          .01
50-99        .3       .9     10.2      9.0       97.1          .01


* Errors per 100 numbers read.


Accuracy. Table 2 summarizes the meas- 
urements of errors in reading Arabic and Roman 
numerals. The number of errors per 100 num- 
bers read is significantly greater at the five 
per cent level for Roman than for Arabic single 
digit numbers. The differences in favor of 
Arabic numerals for the two sets of larger 
numbers were significant at the one per cent 
level. The absolute difference in errors per 
100 numbers ranges from .3 for single digit 
numbers to 9.9 for the 50-99 set. These 
differences represent 75.0, 96.4, and 97.1 per 
cent fewer errors made with Arabic than with 
Roman numerals. 
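
The percentages are relative to the Roman figure in each case. A brief sketch of that arithmetic follows, using means as read from Tables 1 and 2. The single-digit error means (.1 for Arabic, .4 for Roman) are an assumption: they are the values implied by the reported difference of .3 and the reported 75.0 per cent. The function names are illustrative.

```python
def pct_fewer_errors(arabic, roman):
    """Per cent fewer errors with Arabic numerals, relative to Roman."""
    return (roman - arabic) / roman * 100


def pct_more_read(arabic, roman):
    """Per cent more numbers read per minute in Arabic, relative to Roman."""
    return (arabic - roman) / roman * 100


# Mean errors per 100 numbers read: (Arabic, Roman).
error_means = {"1-9": (0.1, 0.4), "10-49": (0.3, 8.4), "50-99": (0.3, 10.2)}

error_reduction = {
    digits: round(pct_fewer_errors(a, r), 1)
    for digits, (a, r) in error_means.items()
}

# Speed for the single digit set, numbers per minute (Arabic, Roman).
speed_gain_single = round(pct_more_read(183.9, 122.5), 1)
```

Because the base is the Roman mean, a small absolute difference such as .3 errors per 100 numbers can still correspond to a large percentage.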


Discussion 


That the absolute and relative differences
in favor of Arabic numerals increase as the
numbers get larger is not surprising; and it is 
probable that they would be even greater for 
larger numbers, which in the Roman system 
not only get more complex but add new sym- 
bols as well. 

Roman numerals are frequently used for 
chapter and volume numbers, numbers of 
graphs and tables, dates on buildings, movies, 
and books, etc. The results obtained indicate 
that such use slows the reader and may cause 
errors. Such numbers are usually found in 
comparative isolation. Quite possibly they 
would cause even more hesitation and error 
when presented in context than in isolation as 
they were in this experiment.

Although the absolute and percentage dif-
ferences in speed are substantial, the loss of
time resulting from the slower reading of the 
few Roman numbers ordinarily met with will 
not often be large. Nevertheless, there should 


be more than personal preference or desire for 
variety to justify wasting even small amounts 
of the reader’s time. 

On the other hand, a single error may be 
very important. It may be assumed that, if 
a number is presented to a reader, it is with 
the intention of its being read correctly. The 
results show Roman numerals to be far less 
satisfactory than Arabic in this respect. 


Summary 


1. An experiment was carried out to deter- 
mine how much faster and more accurately 
Arabic numerals are read than Roman numerals. 

2. Thirty college students were asked to read 
as fast and as accurately as possible sets of 
numbers from one to nine, 10 to 49, and 50 to 
99 in both Arabic and Roman numerals, and 
measurements of speed and errors were taken. 

3. Percentages of increase in speed of reading 
for Arabic over Roman numerals were 50.1 per 
cent for numbers from one to nine, 137.5 per 
cent for numbers from 10 to 49, and 349.4 
per cent for numbers from 50 to 99. All
differences were significant at the one per cent 
level. 

4. Percentages of decrease in errors for 
Arabic from Roman numerals were 75 per cent 
for numbers from one to nine, 96.4 per cent 
for numbers from 10 to 49, and 97.1 per cent 
for numbers from 50 to 99. The first differ-
ence was significant at the five per cent level
and the last two at the one per cent level. 

5. Use of Arabic rather than Roman nu- 
merals would seem to be desirable for most 
purposes because the former are read faster and 
more accurately. 


Received November 5, 1951.


The Dimensional Analysis of Motion: IV. Transfer Effects 
and Direction of Movement¹


Patricia von Trebra and Karl U. Smith 


University of Wisconsin 


The dimensional analysis of human motions 
is based on three main principles which repre- 
sent a departure from the current concepts of 
investigation of motor coordination, especially 
those identified with the study of psychomotor 
skills and of time and motion analysis. These 
three principles are: (1) separate measurement 
of the unrelated component movements of a 
motion pattern; (2) development and use of 
pre-planned performance situations that per- 
mit explicit quantitative control of the reactive 
dimensions of motion; and (3) investigation of 
the determining variables affecting motion in 
relation to known and specifiable quantitative 
reactive dimensions that identify the movement 
pattern. New techniques, to be described 
below, permit implementation of these princi- 
ples for the purposes of the present study. 

In the investigation to be reported here, a 
dimensional study of transfer of training has 
been conducted. The degree of transfer from 
one pattern of motion to others has been inves- 
tigated relative to controlled change in the 
direction of movement in the motion pattern. 
The transfer effects measured cover the manip- 
ulative and travel components of movement 
in the motion investigated. At the same time, 
information has been obtained concerning the 
nature of learning different directions of manual 
motions. 


Methods 


1. Apparatus. The main objectives of the 
present study have been to measure separately 
the unrelated travel and manipulative compo- 
nents of a complex motion pattern when the 
subject is trained in a particular direction of 
movement and is thereafter required to per- 
form the same pattern of movement in different 
directions of motion. 

In order to segregate the travel and manipu- 
lative components of the motion pattern for 
measurement, the Universal Motion Analyzer 


¹ This research has been supported by funds granted
by the Graduate School Research Committee, Univer-
sity of Wisconsin.


(2), shown diagrammatically in Figure 1, has 
been used. This device consists of a special 
electronic relay circuit that will measure auto- 
matically the duration of different components 
of motion. The relay is connected to the sub- 
ject, to the work situation, and to two precision 
time clocks. The device operates in such a 
way that the subject’s contact with objects or 
controls to be manipulated acts to key the 
relay and thus activate the clocks. A dual 
operation is involved in the relay circuits. 
The duration of time in which the subject is 
in contact with the work object or control is
registered on the manipulation time clock.
The duration of time required to travel between
two work objects or controls is registered on
the travel time clock. Time is measured in
.01 seconds. The clocks used are calibrated
to a precision of .005 seconds.
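
The dual operation of the relay can be pictured as a simple accounting of contact and non-contact intervals. The sketch below is only an illustration of that logic, not a description of the actual circuit; the function name and the contact times are hypothetical.

```python
def clock_totals(contacts):
    """Split a motion record into manipulation and travel time.

    contacts is an ordered list of (touch, release) times in seconds,
    one pair per switch. Time in contact goes to the manipulation
    clock; time between releasing one switch and touching the next
    goes to the travel clock.
    """
    manipulation = sum(release - touch for touch, release in contacts)
    travel = sum(
        next_touch - release
        for (_, release), (next_touch, _) in zip(contacts, contacts[1:])
    )
    # Readings are recorded to .01 second, as in the text.
    return round(manipulation, 2), round(travel, 2)


# Hypothetical record of three successive switch contacts.
contacts = [(0.00, 0.42), (0.91, 1.30), (1.85, 2.20)]
manipulation_time, travel_time = clock_totals(contacts)
```

Keeping the two totals on separate clocks is what allows the travel and manipulative components to be analyzed independently later in the study.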

Experimental control of the reactive dimen-
sions of movement is accomplished by means of
pre-planned performance situations that per-
mit quantitative variation of the reactive space
dimensions of the motion pattern. The per-
formance situation used here is a large control
panel equipped with turn-switches, as shown
in Figure 1. These switches, and the control
panel on which they are arranged, are designed
so that the direction of motion, the pattern of
manipulation, the plane of movement, the dis-
tance of travel, the complexity of movement,
and other dimensions defining the motion pat-
tern may be controlled as desired.

In order to vary the direction of motion,
the four patterns of movement shown in Figure
2A were used. The small circles represent the
switches operated, and the arrows indi-
cate the direction of motion between them.
Pattern I is a right-down movement. The
other three patterns are a left-up motion (Pat-
tern II), a left-down motion (Pattern III), and
a right-up motion (Pattern IV). Each pat-
tern is identical except for the direc-
tion of travel within the total motion. There
are nine manipulations in each pattern.


Fig. 1. Diagram of the pre-planned performance
situation with the Universal Motion Analyzer.


Fig. 2. The patterns of motion used in the transfer tests (2A) and in the pretraining tests (2B).


As already described, the subjects of this
experiment were required to practice one pat-
tern of motion and were thereafter tested in
their performance on other directions of move-
ment. To secure an appropriate experimental
design of the observations, the subjects used
were divided into four matched groups on the
basis of their level of performance in a pretest
with four different patterns of motion. The
four motion patterns used in this pretest are
indicated in Figure 2B. During the pretest,
and in the main observations, the pattern of
motion to be performed was indicated by a
lined guide marked out on the instrument panel.
All the main performances were carried out
with the right hand.

2. Experimental Design. The subjects con-
sisted of 48 right-handed college students, both
male and female. In the procedural design, the
number of training days was held constant for
all subjects. After training on a particular
pattern of movement, the subjects were
tested on the other three patterns of motion,
and the presentation of these transfer patterns
was randomized by subjects. The effects of
individual differences were kept at a minimum
by the use of four matched groups, which were
formed in terms of performance on the pretest
motion patterns. The experimental design is
summarized in Table 1. Details of this design
may be indicated as follows:


a. In order to get four matched groups, all 
subjects were pretrained over a period of 4 
trials on each of the four pretraining patterns 
described in Figure 2B. These patterns were 
selected because they approximated the direc- 
tional characteristics involved in the experi- 
mental work situation without the directional 
complications of the training patterns. Each 
subject performed the entire 16-trial sequence 
in random order, first with his right hand, then 
with his left hand. 

b. To provide training over an extended 
period of time, each group was trained in one 
of the four training patterns. Each subject 
received 4 trials per day for 8 successive days 
on his particular group pattern. 

c. To obtain a measure of transfer effect, 
each subject performed one trial on each of the 
four movement patterns, including the pattern 
on which he had been trained, on the day 
following completion of training. There were 
four patterns of transfer order which were 
randomized by subjects within each group so 
that each pattern of 4 trials was performed by 
3 of the 12 subjects in each group. 
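
The balanced randomization described in step c can be sketched as follows. The four orders listed here are hypothetical placeholders (the actual sequences are those of Table 1), and the subject numbering, the seed, and the function name are likewise assumptions made for illustration.

```python
import random

# Hypothetical transfer orders; each lists the four patterns once.
ORDERS = {
    "A": ["I", "II", "III", "IV"],
    "B": ["II", "IV", "I", "III"],
    "C": ["III", "I", "IV", "II"],
    "D": ["IV", "III", "II", "I"],
}


def assign_orders(subjects, rng):
    """Give each order to an equal share of the subjects, at random."""
    labels = list(ORDERS) * (len(subjects) // len(ORDERS))
    rng.shuffle(labels)
    return dict(zip(subjects, labels))


rng = random.Random(0)  # fixed seed so the illustration is repeatable

# Four training groups of 12 subjects each (numbering is illustrative).
groups = {
    name: list(range(start, start + 12))
    for name, start in zip(["I", "II", "III", "IV"], [1, 13, 25, 37])
}
assignment = {name: assign_orders(members, rng) for name, members in groups.items()}
```

Shuffling a list that already holds each order three times guarantees that every order is used by exactly 3 of the 12 subjects in a group, while leaving which subjects receive it to chance.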




Table 1

Experimental Design

Pretraining     Training        Subject     Training     Transfer
Trials¹         Conditions²     Nos.        Days³        Conditions⁴

1 4 2 3         I               1-12        1-8          A, B, C, D
...             II              13-24       1-8          A, B, C, D
...             III             25-36       1-8          A, B, C, D
2 1 3 4         IV              37-48       1-8          A, B, C, D


¹ This sequence of trials, performed with both right and left hand, involves performance on four patterns,
randomized as shown. The same sequence was used for all subjects.

² Subjects were divided into four matched groups on the basis of the mean right-hand performance time in the
pretraining.

³ Four trials were given each day with the right hand, making 32 practice trials in all.

⁴ The four different orders were randomized by subjects. Each subject performs in one series of trials the
conditions indicated.


Results


The results of this experiment may be divided
into three general parts for presentation.
Learning effects found in the experiment will
be described first, then the data related to the
transfer observations. Finally, some special
observations related to transfer effects will be
noted.


1. Learning Effects. During the eight days’
practice by the four different groups on each
of their respective patterns of motion, learning
differences were found for different directional
patterns and for the two component aspects
of motion measured. Although these data
have not been subjected to detailed statistical
analysis, it may be noted that the manipulation
component of the motion pattern showed a
progressive learning effect throughout the
eight days’ practice for all four directions of
motion. In contrast, the travel movements


showed inconsistent changes related to learn-
ing. There is no significant learning effect for
travel movements on Pattern IV by Group IV.
This is the right-up pattern of movement.
The over-all change in the travel component
of the movements due to practice is very small
for those patterns giving a learning effect at all.
This change amounts roughly to nine per cent
decrease in time over the eight days. The
mean decrease in duration of manipulation
due to practice is around fifteen per cent. The
learning effect on the travel components of the
movements takes place over the first three
or four days, if it occurs at all, whereas the
effect of practice on manipulation is uniform
and continuous over all eight days.

The relative efficiency of performance of the
different groups varies during training, al-
though the differences are not large. Manipu-
lative performance by Groups I and III on their
respective patterns is inferior to that of Groups
II and IV on their learning patterns. The
most efficient of all groups in manipulation
during training is Group II on Pattern II, the
left-up motion.

In terms of performance of the travel com-
ponent of movement during learning, Groups
I and III on their respective patterns are the
most efficient, and Groups II and IV are the
least efficient. Thus, the groups and patterns
showing consistently the fastest manipulative
motions during training display slower travel
motions.

2. Transfer Effects. Four separate analyses of
variance were computed for the data concerning
performance of the two hands and the manipulation
and travel data for each hand.* These
analyses brought out two main facts. Differ- 
ences due to training groups were shown in the 
transfer tests to have no statistical significance 
at all. This statistical test may be said to 
indicate, among other things, that the initial 
matching of the training groups was successful. 
However, differences in performance due to the 
transfer patterns were significant in all four 
analyses. That is to say, both manipulation 
and travel time data for each hand show sig- 
nificant variation as a result of the presentation 
of different patterns of movement during trans- 
fer. These differences due to the effects of 
transfer were significant at the .01 point in all 
but the analysis of variance in left-hand travel 
time. In this case, transfer patterns produced 
differences which were significant at the .05 
point. 

The effect of order of performing the different 
patterns of movement in the transfer test is also 
significant in this experiment. As mentioned 
before, this order was randomized for subjects 
within each group. 

* The critical figures of this experiment and the summaries
of the analyses of variance are on file with the
American Documentation Institute. Order Document
[number illegible] from American Documentation Institute, 1719 N
Street, N.W., Washington 6, D. C., remitting $1.00 for
microfilm (images 1 inch high on standard 35 mm.
motion picture film) or $1.00 for photocopies (6 × 8
inches) readable without optical aid.

The actual phenomena of transfer of learned
response observed in this experiment may be
described by considering the degree to which
performance on a given pattern of movement
will be changed, either positively or negatively,
by training upon other patterns of motion. To
express this change quantitatively, the difference
between initial performance on a given
pattern and the performance on that pattern
during the transfer test has been computed.
To obtain a “transfer index,” this difference 
is then divided by the value representing the 
initial level of performance and the ratio 
multiplied by 100. 
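
As a rough illustration of the arithmetic just described, the sketch below computes a transfer index from two time scores. The function name, the sign convention (initial minus transfer, so that a shorter time on the transfer test gives a positive index), and the sample values are assumptions made for illustration; they are not taken from the experimental data.

```python
def transfer_index(initial_time, transfer_time):
    """Per cent change from initial performance to transfer-test performance.

    Assumes both scores are durations, so a positive index means faster
    performance on the transfer test (positive transfer).
    """
    if initial_time <= 0:
        raise ValueError("initial_time must be positive")
    return (initial_time - transfer_time) / initial_time * 100.0


# Hypothetical durations in seconds, not values from the experiment.
faster = transfer_index(2.0, 1.5)   # positive transfer
slower = transfer_index(2.0, 2.5)   # negative transfer
print(faster, slower)
```

Under this assumed convention, an index of zero corresponds to no change from the initial level of performance.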

Figure 3 summarizes the mean transfer index 
for the four different transfer patterns. These 
values are described in a line graph even though 
they do not represent any sort of continuous 
function. This graph is obtained by averaging 
the three transfer indices of three different 
patterns of movement on which different 
subject groups were trained with reference to 
a particular pattern to which these three train- 
ing groups transferred. The graph may be 
interpreted to indicate which of the four pat- 
terns of movement gave the greatest amount 
of positive or negative transfer for manipulation 
and travel components of the motion pattern
when training has been carried out in three
other different directional patterns.

[Figure 3: line graph of the transfer index (positive, zero, and negative transfer) plotted against transfer pattern, with separate curves for manipulation and travel.]

Fig. 3. The variation in the level of transfer from
the other directional patterns of motion to a given
directional pattern for manipulation (solid line) and
travel movements (open line). Level of transfer is
indicated in terms of the transfer index.
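
The averaging behind each point of Figure 3 can be sketched in a few lines. For a given transfer pattern, the indices of the three groups trained on the other patterns are averaged. The data layout, the names, and every number below are hypothetical and serve only to show the computation.

```python
from statistics import mean

PATTERNS = ["I", "II", "III", "IV"]


def mean_transfer_index(indices, target):
    """Average the transfer indices for one target (transfer) pattern.

    `indices` maps (training_pattern, transfer_pattern) to a transfer index.
    Only the three training patterns that differ from `target` are used.
    """
    values = [indices[(trained, target)] for trained in PATTERNS if trained != target]
    return mean(values)


# Hypothetical indices for transfer to Pattern III (not experimental data).
example = {("I", "III"): 10.0, ("II", "III"): 20.0, ("IV", "III"): 30.0}
print(mean_transfer_index(example, "III"))
```

Keeping the indices keyed by both training and transfer pattern would also allow the reverse summary, grouping by training pattern, without changing the data structure.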

Figure 3 shows that the manipulation com- 
ponent of the movement pattern displays posi- 
tive transfer effects on all four transfer patterns 
used. This means that practice on any of the 
training patterns will produce an improvement 
in performance in manipulation on any one of the 
transfer patterns. In contrast to this, training 
on given patterns of movement produces gener- 
ally a decrement in performance or a negative 
transfer effect for the travel component of the 
motion pattern. Pattern II is the only one of
the four transfer patterns that displays a
positive transfer effect for the travel component
of motion.

For manipulation, the pattern showing the 
largest transfer effect in terms of these mean 
values is Pattern III. In Pattern III, the 
subject moves his arm down and to the left. 

These data on transfer effects may also be 
described by defining the training patterns 
which produce the greatest amounts of positive 
or negative transfer to other patterns of move- 
ment. These relations are very complex and 
difficult to describe in the limited space avail- 
able. When the data are viewed in these 
terms, however, it is seen again that the trans- 
fer effects on the two different components of 
movement are quite distinct. The transfer
effects for manipulation are generally positive
for different training patterns, whereas these
effects for travel are generally negative. Pattern II,
in which the direction of movement is
up and to the left, generally produces a higher
positive transfer effect in manipulation to the
other three patterns. The travel component
of motion shows the highest negative transfer
effects in the case of transfer from Pattern IV
to Patterns I and III. In Pattern IV the
subject moves up and to the right. The most
consistent negative-transfer effects are found 
in transfer from Pattern II to all other patterns. 

The discussion above has centered around 
the relative level of performance of the right 
hand in the transfer tests as compared to initial 
performance with this hand. The design of 
this experiment also permitted some estimation 
of the relative level of proficiency of the left- 
hand performance in manipulation and travel 
on the different transfer patterns.

Bilateral transfer from right hand to left
hand, when training covers all different patterns,
produces the most rapid performance in
manipulation on Patterns II and III. Similar
transfer effects in the travel component of
movement produce the most rapid performance
on Pattern III.

The training patterns of the right hand
associated with the most rapid performance in
transfer of manipulation to the left hand are
Patterns II and III. In travel movements,
training on Pattern III produces the most
efficient performance on all transfer patterns
with the left hand.

In addition to the above observations, some
data were also obtained on the persistence of
transfer effects. These observations were made
by repeating the transfer tests on Group [number illegible] at
an interval of one week after the initial transfer
tests. The relative and absolute levels of per- 
formance on different transfer patterns, and on 
the training patterns for this group, showed no 
marked changes in manipulation or travel after 
this one-week interval. In other words, the 
effects of learning a particular pattern of mo- 
tion upon performance in other directions of 
movement apparently persist without signifi- 
cant alteration over an interval of at least one 
week. 


Summary 


A new systematic point of view, along with 
pertinent specific principles, has been described 
for the applied and theoretical study of motion. 
Special new techniques, implementing these 
principles, have been used in this investigation 
of transfer of learned movements from one 
direction of motion to others. 

Learning data indicate that, for different 
directional patterns of motion, the travel com- 
ponent on the pattern shows limited and incon- 
sistent changes with practice, whereas the 
manipulative component of the pattern dis- 
plays uniform progressive change throughout 
the eight days of training. These data are 
consistent with earlier findings in showing that 
learning effects for different component movements
in a motion pattern are quite distinct
(1, 3).

Transfer of response from one directional 
pattern of motion to others presents certain 
striking features when examined in terms of the 
component movements of the motion sequence:
Manipulative movements show generally a 
positive transfer effect. Travel movements 
display generally negative transfer effects. 
Significant effects exist between the directional 
change and both components of motion even 
though the direction of travel is the only 
dimension of motion which is altered experi- 
mentally. 

Generally speaking, some directions of mo- 
tion show a greater transfer effect than others 
when training is carried out on the other 
different directions of motion. Similarly, the 
degree of transfer effect from a given training 
pattern varies with the direction of response. 
All of these effects are specific to the component 
of movement under consideration. 

Bilateral transfer effects occur. Generally 
speaking, there are two patterns of motion, 
among the four studied, which are associated 
with the high levels of efficiency by the left 
hand when transfer occurs bilaterally. 

Transfer effects, as described, persist, rela- 
tively unaltered, over an interval of one week's 
time. 

The phenomenon of transfer of training is
a fundamental problem in the applied psychology
of training, training equipment design,
instrumentation, and planning of work. Heretofore,
general notions about transfer effects in
these fields have been guided by general theoretical
points of view identified by the terms,
identical elements, transposition and response
induction. The results presented here not only
define the limitations in application of such
generalized ideas but also prescribe the systematic
empirical orientation necessary to
specify and predict quantitatively the detailed
phenomena of motor coordination where the
human subject must change a learned mode
of response. Technical developments in the
quantitative measurement of component movements
in the motion pattern and in the design
of pre-planned quantitatively controlled task
situations are fundamental to this empirical
approach to the study of the psychophysical,
psychophysiological, and dynamic phenomena
of motor coordination.


Received October 9, 1951. 


References 


1. Rubin, G., von Trebra, Patricia J., and Smith, K. U.
Dimensional analysis of motion. III: Complexity
of movement pattern. J. appl. Psychol.,
1952, 36, 272-276.

2. Smith, K. U., and Wehrkamp, R. A. Universal
motion analyzer applied to psychomotor performance.
Science, 1951, 113, 242-244.

3. Wehrkamp, R. A., and Smith, K. U. Dimensional
analysis of motion. II: Travel distance effects.
J. appl. Psychol., 1952, 36, 201-206.


Book Reviews 


Rose, Arnold M. Union solidarity. Minne- 
apolis: Univ. Minnesota Press, 1952. Pp. 
209. $3.00. 


Union Solidarity is an unusual work in many 
respects. It presents one of the few research 
efforts concerning the internal workings of 
unions. Moreover, the research was initiated 
by a union, a type of organization which has 
been very reluctant to bare its intimacies to 
public survey. But perhaps most surprising 
of all, it reports findings using actual (not 
disguised for purposes of anonymity) names 
and places.

The title of Dr. Rose’s book may be to some
a bit misleading. It is not a work of an arm- 
chair academician, theorizing on how the “union 
movement” should operate to be in accord with 
preconceived postulates. Union Solidarity is, 
rather, a descriptive report based upon hard, 
cold data provided by those who know the 
actual significance of unionism best—the union 
members. Dr. Rose’s book does not pretend
to present the reader with a neat package of
information with which to understand the
intricacies of the American Labor Movement’s
internal workings.

Union Solidarity will provide the reader,
however, with intimate details about how one
local union’s membership views its union. It
will provide him with the possible significance
of such opinion within a rather broad and
somewhat vague conceptual framework of
“union solidarity.”

Briefly, Dr. Rose’s book is based upon material
gathered by means of a structured response
category questionnaire interview. Although
he used relatively unskilled interviewers (students
subjected to a brief training session),
the possible ill-effects should have been alleviated
to some degree by the design which, for
the most part, gave the interviewees predetermined
response category alternatives as
the answers to the questions. In a few
questions, however, the interviewer and interviewee
were given more freedom. To that
extent, the reliability of the data is open to
question.

Although the results are presented in terms
of comparative percentages with considerable
cross comparisons between categories of response,
particularly with reference to the
factual information questions, e.g., age, length
of union membership, number of meetings
attended, etc., it is difficult to see how his data
could be subjected to rigorous statistical analysis.
Purely in the sense of an aside, the
statistically oriented reader may be somewhat
puzzled by the use of the term “statistically
significant” without any indication of the level
of significance or how such significances were
derived.

Basically, Dr. Rose has presented a descrip- 
tive work. There was little attempt to relate 
the interview findings to membership behavior 
in the union situation. His only attempt was 
to measure certain select categories of interview 
findings against other categories which, on an 
a priori basis, were used as criteria of union 
solidarity. At best, the criteria appear to be 
relatively vague and founded primarily on 
theory, not fact. In general, the study, in and 
of itself, appears to have little predictive 
potential. 

At this point it must be stressed, however, 
that Dr. Rose’s inferences, drawn from the 
results, are notably cautious. He is careful 
to state the limitations of the study, and except 
for the relatively definitely stated criteria of 
union solidarity, does not force the results to 
fit into any preconceived theory. 

To a great extent, the reader is given the 
results, given several tentative explanations, 
and thereafter allowed to forge ahead for 
himself if he is desirous of more definite inter- 
pretations. Dr. Rose does proffer several gen- 
eralizations, both practical and theoretical, 
in his last chapter. However, in keeping with 
the major part of his writing, he makes them 
more in terms of suggestions than hard and 
fast conclusions. 

Even though Union Solidarity is not, and 
does not pretend to be, more than a descriptive 
evaluation of one local union, it will provide 
for those interested, even tangentially, in 
unionism, much interesting and informative
data. For the union leader, it should give
insight into the potential usefulness of social
science techniques; for the teacher and student 
of labor and industrial relations, an opportunity 
to get a glimpse into that elusive and hitherto 
relatively unexplored area, the union; and for 
the research worker in the field of unionism, 
insight into the advantages and disadvantages 
inherent in conducting research utilizing a 
comparable technique, as well as comparative 
data against which to relate his findings. 


Hjalmar Rosen 


Institute of Labor and Industrial Relations, 
University of Illinois 


Di Michael, Salvatore G. (Ed.) Vocational 
rehabilitation of the mentally retarded. Wash- 
ington: U. S. Government Printing Office, 
Federal Security Agency, Office of Voca- 
tional Rehabilitation, 1950. Pp. viii + 184. 
45¢. 

This concise yet comprehensive treatment 
of the vocational rehabilitation of the mentally 
retarded is a volume of value to those who are 
concerned with the adjustment of others. It 
is written primarily to meet the needs of the 
professional staff of the State-Federal program 
of Vocational Rehabilitation and is one of a 
series of staff development aids pointed toward 
specific rehabilitation problems. 

The first six chapters of the bulletin cover 
some basic considerations in rehabilitation and 
were written by individuals who have worked 
intensively with the mentally retarded. The 
last three chapters, or second part of the bulletin,
contain descriptions of three recent programs
of rehabilitation of this group.

Jervis’ chapter on the medical aspects of 
mental deficiency is one of the most succinct 
the reviewer has seen. It will serve as a ready 
reference for an operational understanding of 
the etiology and symptoms of mental retarda- 
tion. Hegge’s point of reference for the deter- 
mination of the feasibility of training a client 
is a healthy one. This determination is made 
on the basis of the client’s ability to benefit 
from the service. The need for careful avoidance
of rule-of-thumb methods of selection and
the importance of use of individual appraisal
methods are well emphasized. The implication
this carries for the necessity of having well
trained and carefully chosen rehabilitation
counselors might have been stressed by the
writer.

In a chapter on the counseling of the mentally
retarded, Yepsen emphasizes the fact that the
mentally retarded are not qualitatively different
from the “normal” group, but are different
only in degree. He recommends the usual 
counseling techniques but emphasizes the need 
for working more slowly and for carefully 
appraising the counselee’s understanding of 
each step. If there is any research particularly 
of the kind in which recordings have been made 
of counseling interviews with the mentally 
retarded, reference to it would be valuable. 
If there is none, certainly it is needed for ap- 
praisal of counseling techniques with this 
group. 

The use of job analysis is recommended in 
the chapter on employment. The availability 
of such assistance from the USES might well 
have been mentioned. 

It is gratifying to read the reports on the 
work that has actually been done in the 
rehabilitation of the mentally retarded. The 
emphasis on the job levels at which these 
people can succeed, the need for careful follow- 
up, the aid to the parents, the training in 
personal habits, and the assistance needed to 
establish good relationships with other employ- 
ees are all necessary parts of an integrated 
program. 

Almost anyone who works with people at 
some time encounters the mentally retarded. 
The suggestions found in this bulletin can be of 
invaluable assistance in such instances. 


Vivian H. Hew 
Student Counseling Bureau,
University of Minnesota


Feingold, S. Norman. Scholarships, fellowships,
and loans. (Vol. II.) ...man Publishing Company, ...
312. $5.00.


Additional information about student aids 
administered by some 245 agencies can be 
found in Vol. II of Scholarships, Fellowships, and
Loans. Written for the student who may not 
have the financial resources to continue his 
education—as well as for his counselor—this 
directory lists opportunities not usually found
in school or college catalogues nor included in
Vol. I. 

The directory is divided into three parts:
a short introductory and editorial portion;
brief summaries of various career aids,
some 240 pages in length; and
a bibliography and three indexes.

In Part I, the author summarizes the need 
for student aids; at the same time pointing to 
the failure of guidance counselors to utilize 
available resources because they are not gen- 
erally aware of their existence. Local and 
regional sources, particularly, are neglected 
and suggestions and examples are given so that 
the professional worker may uncover them. 

A sample application blank, hints to candidates
for student aid and advice on career
planning are included in this first part. Directed
specifically to the student, the material on 
career planning is short and somewhat super- 
ficial. A check list of “things to remember” 
when applying for financial assistance is per- 
tinent, well done and should help the student 
organize his scholarship hunt. 

Page length summaries of available aids 
constitute the subject matter of Part II. 
Alphabetically arranged by sponsoring agency, 
information has been noted on the name and 
address of the resource, sex and qualifications 
of the applicant, amount of funds available, 
special fields of interest and where to obtain 
further information and application blanks. 
Whether the aid is available for undergraduate 
study, graduate work or research is noted in 
each summary. Most of the aids listed are 
sponsored by private organizations, rather than 
schools, and so are not usually found in school 
or college catalogues. 

Part III contains a comprehensive and 
lengthy bibliography which should be of help 
to those interested in setting up a student file. 
Contained also are three indexes designed to 
make the material in the book accessible to 
the reader. The first index covers the intro- 
ductory material. The other two indexes cover 
the student aid material; one is an alpha- 
betically arranged subject index, and the other 
is arranged on the basis of vocational goal and 
field of interest. 

Vol. II is intended as a companion for Vol. I 
and not as a substitute. Some improvements 
have been made in the second volume, notably
that of including more general education aids
on a regional or local level. The indexing of
this important material could have been improved
for easier accessibility.

The reviewer was impressed with the number 
of aids uncovered by Dr. Feingold, a native of 
Boston, in his own New England region. It 
would be a Herculean task to gather such 
complete information for every geographic area 
and the author makes no claim to having done 
so. He suggests—and counselors can profit 
from his example—the gathering of such infor- 
mation on a local or regional basis. 

Scholarships, Fellowships, and Loans, Vol. 
II, together with Vol. I, offers the counselor a 
comprehensive, though not all-inclusive, survey 
of financial aids available in this country. 
Those working with young people can make 
good use of such a centralized listing. 


Solomon Shapiro 
Jewish Vocational Service, 
St. Paul, Minnesota 


Berdie, R. F. (Ed.) Concepts and programs of 
counseling. Minnesota Studies in Student 
Personnel Work, No. 1. Minneapolis: Uni- 
versity of Minnesota Press, 1951. Pp. 81. 
$1.75. 


This book contains five papers originally 
delivered at the conference for administrators 
of counseling programs, held at the University 
of Minnesota in November, 1950. Despite 
their having originally been written for such a 
group, they should, because of their wide 
implications, thoughtful approach to practical 
problems, and largely non-technical language, 
be of considerable interest to the many people 
whose work brings them into direct or indirect 
contact with professional counseling. 

Included are papers by O. H. Mowrer on
anxiety theory and the distinction between 
counseling and therapy, F. M. Fletcher con- 
cerning some problems of counseling personnel, 
W. M. Gilbert outlining some suggestions for 
intra-university organizational relationships, 
J. L. Holmes dealing with recent developments 
in counseling, and P. L. Dressel considering 
some issues involved in counseling evaluation. 
Each paper was intended, originally, to form 
the basis for group discussion; together, they 
should provoke much productive thought.
Despite the diversity of topics, the underlying
unity will be readily apparent, and the book is 
well balanced by the consideration, from differ- 
ent points of view, of many important, practical 
problems. 

It is impossible to consider in detail the 
many highly stimulating points brought out. 
Mowrer’s discussion, in which he proposes a 
distinction between counseling and therapy 
based on learning and personality theory rather 
than bias, is most interesting although it seems 
to imply a dichotomy, normal versus neurotic, 
which is hard to accept. Fletcher’s incisive 
comments on a wide range of problems associ- 
ated with counseling personnel are excellent. 
In large part, he presents such problems with- 
out answers, for there are none, but his relating 
them and opening them up should bring about 
more effort in this direction. Gilbert’s paper 
is most interesting in its emphasis on certain 
practical—and fundamental—issues involved 
in intra-university relationships. His discus- 
sion of the confidentiality of counseling infor- 
mation is particularly worth while, even though 
some may consider his position extreme. 
Holmes’ summary of recent developments high- 
lights the relatively recent emergence of coun- 
seling as a recognizable profession. There 
seems, however, to be too heavy an emphasis 
on projective techniques to the exclusion of
other equally interesting developments. One 
of the most stimulating papers is that by 
Dressel on evaluation. He not only sees the 
need for such work but also the many difficulties 
besetting those intrepid enough to undertake 
it. His suggestions should be welcomed by 
all whose professional consciences are bothering 
them about the state of affairs in evaluation. 

If there is any over-all criticism, it would 
have to be that too few topics are dealt with 
at real depth. But perhaps the very extensiveness
of coverage which, perforce, necessitated
some superficiality is an advantage. For these 
papers are excellent stimuli to thought and 
discussion; the fact that they open up so many 
problems makes them, in this sense, particu- 
larly valuable. Dr. Berdie is to be congratu- 
lated for having planned this conference about 
such worth-while topics; the various writers 
should be commended (and read) for thoughtful,
practical, and most interesting pieces of
work.


John W. Gustad 
University of Maryland 


Keys, A., Brozek, J., Henschel, A., Mickelsen, 
O., and Taylor, H. L. The biology of human 
starvation. Minneapolis: University of Min- 
nesota Press, 1950. Pp. xxxii + 1385, 2 Vol. 
$24.00. 

These two volumes present in systematic 
detail not only the results of the writers’ 
extensive and intensive study of controlled 
semi-starvation but also a comprehensive 
survey of the world’s literature through 1949 
dealing with the general subject. 

The “Minnesota Starvation Experiment” 
consisted of a careful longitudinal study of 
the physical and psychological characteristics 
of 32 conscientious objectors who were sub- 
jected to semi-starvation. Thirty-six subjects 
were selected from 100 volunteers. (Special 
attention is given to the four who did not 
complete the experiment.) The subjects were 
studied throughout a 12-week control period, 
24 weeks of semi-starvation (average loss of 
24% of initial body weight), 12 weeks of 
restricted rehabilitation, 8 weeks of unre- 
stricted rehabilitation and in some cases 
follow-up periods of 8-12 months. 

The book is divided into six sections: 
background, morphology, biochemistry, phys- 
iology, psychology and special problems. 
Each section presents in detail the information 
in the literature and the new experimental 
data, critically reviewing theory and fact.
Appendixes on methods, detailed data (119 
pages), wartime diets and rations, and notable 
famines plus references (87 pages) and a general 
index complete the book. 

The importance of the work to those con- 
cerned with human nutrition can hardly be 
overemphasized but it should be noted that it 
is a stimulating source of facts for the psychol- 
ogists as well. The need for understanding 
the psychological side of starvation is recurrently
stressed, e.g., “Two impressions dominate
the picture; the immense importance of the 
psychological aspect of inanition and the 
comparative simplicity of the nutritional and 
biochemical problem,” page xiii, preface. 

Of particular interest to applied, clinical
and differential psychologists are the sections
dealing with changes in body types, glandular 
activity, increased auditory acuity, drastically 
decreased work capacity and reduced sexual 
activity. Data from intelligence tests, the 
MMPI, GAMIN, STDCR, interviews, diaries, 
self ratings and autobiographies give a dramatic 
picture of “semi-starvation neurosis.” This 
personality disturbance, one of the most 
important results on the psychological side, 
was characterized by apathy, irritability, 
introversion, depression and hysteria. It did 
not involve any loss of intellective capacity as 
measured by the CAVD and other tests. 
The remarkable persistence of this neurotic 
pattern far into rehabilitation reveals sharply 
some of the problems to be faced by the 
rehabilitators. 

The scope, coherence and precision of the 
work are impressive. These volumes constitute 
an important reference source for much data, 
new and old, and offer hypotheses and sugges- 
tions which should challenge psychologists. 


James J. Jenkins
University of Minnesota 


Jalota, S. Scientific personnel selection procedure:
A study, Hind Art Press, Godowlia,
India.


This little monograph describes the personnel 
selection procedures used by the Civil Selection 
Board and the Tata Iron and Steel Co. of
India. Candidates for positions are evaluated
by psychologists’ judgments of fitness based 
principally upon intelligence tests and person- 
ality measures of various sorts. These judg- 
ments are “validated” against the ratings of a 
final board which considered the psychologists’ 
reports together with other information con- 
cerning the candidates, such as previous 
experience. 

The American psychologist will find little 
to interest him in this monograph either by 
way of methodology or results. He will be 
disappointed to find such non-standardized 
evaluative procedures and primitive validation 
techniques being utilized. The extensive 
tables giving distribution statistics of and 
intercorrelations among ratings serve little 
purpose. Yet the style of writing has such 
charm and the author is so facile at turning a 
phrase that the reviewer could not help being 
intrigued. Certain errors are described as 
negligible because they do not cost the psychol- 
ogist any anxiety or sleepless nights. Refer- 
ence is made to the Rorschach technique as a 
“closed preserve.” Personality inventories 
are said to be of little value because “To ask a 
candidate to condemn himself for a specific job 
is too utopian a hope for its fulfilment in 
modern hard, materialistic times.” One can 
only hope that a writer with such pleasing 
expression will in the future have a message 
of some value to present. 


Edwin E. Ghiselli 
University of California 




New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be sent to Donald G. Paterson, Editor, 
Department of Psychology, University of Minnesota, Minneapolis 14, Minnesota 


Work measurement, new principles and procedures. 
Adam Abruzzi. New York: Columbia University 
Press, 1952. Pp. 290. $6.00. 

Dynamic psychiatry. Franz Alexander and Helen Ross, 
Editors. Chicago: University of Chicago Press, 
1952. Pp. 578. $10.00. 

Child Rorschach responses. Louise Bates Ames et al. 
New York: Paul B. Hoeber, Inc., 1952. Pp. 310. 
$7.50.

Management controls in industrial research organizations. 
Robert N. Anthony. Boston: Harvard Business 
School, 1952. Pp. 538. $6.75. 

The Arthur adaptation of the Leiter International Per- 
formance Scale. Grace Arthur. Washington, D. C.: 
The Psychological Service Center Press, 1952. Pp. 
73. $3.00. 

Social psychology. Solomon Asch. New York: Pren- 
tice-Hall, Inc., 1952. Pp. 736. 

Equality by statute. Morroe Berger. New York: Columbia
University Press, 1952. Pp. 238. $3.25.

Practical psychology. Revised edition. F. K. Berrien.
New York: The Macmillan Co., 1952. Pp. 640.
$5.00.

Disorganization, personal and social. Herbert A. Bloch. 
New York: Alfred A. Knopf, 1952. Pp. 608. $5.00. 

Marketing research. Ernest S. Bradford. New York: 
McGraw-Hill Book Co., Inc., 1952. Pp. 379. $5.00.

The people elect a president. Angus Campbell and 
Robert L. Kahn. Ann Arbor: University of Michi- 
gan Press, 1952. $1.50. 

The selection of military manpower. Leonard Car- 
michael and Leonard C. Mead, Editors. Washing- 
ton, D. C.: National Research Council, 1952. Pp. 
270. $2.50. 

Cruelty to children. Eustace Chesser. New York:
Philosophical Library, Inc., 1952. Pp. 159. $3.75. 

How to supervise people. Alfred M. Cooper. New 
York: McGraw-Hill Book Co., 1952. Pp. 254. 
$3.75. 

’Twixt the cup and the lip. Margaret Cussler and Mary
L. de Give. New York: Twayne Publishers, 1952.
Pp. 262. $3.95.

Planning and developing the company organization struc- 
ture. Ernest Dale. New York: American Manage- 
ment Association, 1952. Pp. 232. $4.50. 

Elementary school guidance. Ervin Winfred Detjen and 
Mary Ford Detjen. New York: McGraw-Hill Book 
Co., 1952. Pp. 266. $3.75. 

Interracial housing: a psychological evaluation of a social
experiment. Morton Deutsch and Mary Evans
Collins. Minneapolis: University of Minnesota Press,

Social psychology. Leonard W. Doob. New York: 
Henry Holt and Co., 1952. Pp. 583. $5.00.


A dictionary of psychology. James Drever. Baltimore:
Penguin Books, Inc., 1952. Pp. 315. $.85.

Measurements of human behavior. Revised edition.
Edward B. Greene. New York: The Odyssey Press, 
1952. Pp. 790. $4.75. 

Fundamentals of social psychology. Eugene L. Hartley 
and Ruth E. Hartley. New York: Alfred A. Knopf, 
1952. Pp. 740. $5.50. 

Personnel administration and labor relations. Herbert 
G. Heneman and John G. Turnbull. New York: 
Prentice-Hall, Inc., 1952. Pp. 444. $3.95. 

The origin of life and the evolution of living things. Olan 
R. Hyndman. New York: Philosophical Library, 
1952. Pp. 648. $8.75. 

The adolescent and his world. Irene M. Josselyn. New 
York: Family Service Association of America, 1952. 
Pp. 124. $1.75.

Color in business, science and industry. Deane B. Judd. 
New York: John Wiley and Sons, Inc., 1952. Pp. 
401. $6.50. 

Readings in industrial and business psychology. Harry 
W. Karn and B. von Haller Gilmer. New York: 
McGraw-Hill Book Co., Inc., 1952. Pp. 476. $4.50. 

Personality and conflict in Jamaica. Madeline Kerr. 
Liverpool: University Press of Liverpool, 1952. Pp. 
221. 15s. 

Psychoanalytic explorations in art. Ernst Kris. New 
York: International Universities Press, 1952. Pp. 
358. $7.50. 

Adolescence. Marguerite Malm and Olis G. Jamison. 
New York: McGraw-Hill Book Co., Inc., 1952. Pp. 
512. $5.00. 

AFL attitudes toward production, 1900-1932. Jean
Trepp McKelvey. Ithaca: New York State School 
of Industrial and Labor Relations, Cornell Univer- 
sity, 1952. Pp. 160. $1.00. 

Your complexes and you. Vitali Negri. Los Angeles:
American Book Institute, 1952. Pp. 299. $3.50.

Dictionary of mind, matter and morals. Bertrand Russell.
New York: Philosophical Library, 1952. Pp.
290. $5.00.

Understanding ourselves. Helen Shacter. Bloomington:
McKnight and McKnight Publishing Co., 1952.
Pp. 124. $70.

Psychology. Ross Stagner and T. F. Karwoski. New
York: McGraw-Hill Book Co., Inc., 1952. Pp. 582.
$5.00.

Measuring your public relations. Herman D. Stein. 
New York: National Publicity Council for Health 
and Welfare Services, 1952. Pp. 48. $1.25. 

Employment of the older worker. Clark Tibbitts, Arthur 
J. Noetzel, Jr., and Charles C. Gibbons. Kalama- 
zoo: W. E. Upjohn Institute for Community Re- 
search, 1952. Pp. 24. Gratis. 




Applied statistics, a journal of the Royal Statistical So- 
ciety. Leonard H. C. Tippett, Editor. London: 
Oliver and Boyd Ltd., 1952. $4.00 per year.

Handwriting. Frank Victor. Springfield: Charles C
Thomas, Publisher, 1952. Pp. 149. $3.75.

The man on the assembly line. Charles R. Walker and 
Robert H. Guest. Cambridge: Harvard University 
Press, 1952. Pp. 180. $3.25. 

The clinical method in psychology. Robert I. Watson. 
New York: Harper and Brothers, 1951. Pp. 779.

00. 

Elements of statistical method. Third edition. Albert 
E. Waugh. New York: McGraw-Hill Book Co., 
Inc., 1952. Pp. 531. $5.50. 




The dream—mirror of conscience. Werner Wolff. New
York: Grune and Stratton, Inc., 1952. Pp. 348. 
$8.50. 

Joint consultation in British industry. National Insti- 
tute of Industrial Psychology. London: Staples 
Press Limited, 1952. Pp. 276. 21s. 

Morale—and the prevention and control of panic. New 
York: New York Academy of Medicine and Josiah 
Macy, Jr. Foundation, 1952. Pp. 75. 

The Psychological Corporation 30th Annual Report. 
New York: The Psychological Corporation, 1951. 
Pp. 16. Gratis. 




Journal of Applied Psychology


Vol. 36, No. 6


DECEMBER, 1952 


Unfair Employment Practices As Viewed by Private 
Employment Counselors * 


Vernon Keenan and Willard A. Kerr 


Illinois Institute of Technology


American business and industrial leaders 
make rather free use of the terms “freedom of 
opportunity” and “free enterprise.” These 
terms have psychological meaning and signifi- 
cance; apparently they are generally accepted 
as integral parts of the American way of life. 
Yet, as in all cultures, there are dissonant ele- 
ments and contradictions. Sometimes the 
contradictions seriously weaken the working 
strength of a culture. It is possible that un- 
fair employment practices constitute a major 
inconsistency in the current American pattern. 
“Freedom of opportunity” has many implica- 
tions, and one of the most important is equal 
opportunity to compete for jobs. 

Psychologists should be interested in this 
problem. They alone among the scientists 
specialize in construction and use of the ob- 
jective instruments which are the obvious 
alternatives to consideration of race, creed, and 
ideology in the employment process. Their 
own job opportunities and opportunities for 
service to society are, in fact, restricted accord- 
ing to the extent to which selection and place- 
ment testing is not institutionalized as a part of
the industrial culture. This would seem to 
place the real interests of psychologists in fair 
employment practices at a level of professional 
self-interest and, therefore, not merely aca- 
demic or political. 

Examination of the psychological literature
on the problem shows it to be either extremely
sparse or only distantly relevant. Aside from
the literature on attitudes and prejudice in
social psychology (1, 2), the authors found few
references specifically on the problem of adult
employment discrimination. One closely re-
lated study by Saenger and Gordon (4) of
504 representative residents of New York City
indicated lower job satisfaction on the part of
minority group members and still lower job
satisfaction on the part of individuals who had
personally experienced discrimination.

* This report was prepared from data gathered by
means of a questionnaire constructed by Vernon Keenan,
student at the Illinois Institute of Technology.


Employment Counselors 


In speculating as to which occupational 
group has most insight into discrimination in 
employment, it was finally decided that coun- 
selors in private employment agencies were in 
the most strategic positions to obtain and re- 
port such confidential information. This 
seemed true because they handle primarily “the 
white collar” jobs, receive rather specific job 
orders from employers, and typically have a 
wide acquaintance with company employment 
officers.

The Sample. Accordingly, a random sample
of private employment counselors was drawn.
The State of Illinois requires that each such
counselor register annually with its Depart-
ment of Labor. The current list for the
Chicago area contained 738 counselors. Every
third name was drawn for the
sample, making a total of 246. Question-
naires were mailed to these 246 counselors, but
45 were returned unclaimed by the Postal 
Department. Of the remaining 201, 45 com- 
pleted questionnaires were returned (22 per 
cent). Although anonymous, information 
data show the obtained sample to be repre- 
sentative of the total list as to sex distribution 
(51 per cent male) and age (means: male 33,
female 26). Average experience was approxi-
mately five years. Aside from these facts, it
was not possible to make a more refined check 
on the representativeness of those responding. 
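
The sampling arithmetic reported above can be retraced in a few lines. The following is only a sketch: the roster entries are placeholders, the starting position of the every-third-name draw is an assumption (the article does not give it), and the function names are illustrative rather than anything used by the authors.

```python
def systematic_sample(roster, step, start=0):
    """Return every `step`-th entry of the roster, beginning at index `start`."""
    return roster[start::step]


def response_rate(returned, mailed, undeliverable):
    """Completed returns as a proportion of the questionnaires actually delivered."""
    delivered = mailed - undeliverable
    return returned / delivered


# Placeholder roster standing in for the 738 registered Chicago-area counselors.
roster = [f"counselor_{i:03d}" for i in range(1, 739)]

# Every third name; starting with the third name is an assumption.
sample = systematic_sample(roster, step=3, start=2)

# 246 mailed, 45 returned unclaimed, 45 completed questionnaires returned.
rate = response_rate(returned=45, mailed=len(sample), undeliverable=45)

print(len(sample), f"{100 * rate:.1f} per cent")
```

The rate is taken on the 201 questionnaires that reached an addressee, which is the base the article uses for its 22 per cent figure.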

The Questionnaire.¹ The two-page question-
naire was preceded by a cover letter which
stated that “As an employment counselor you 
have had an excellent opportunity to ob- 
serve discrimination in hiring practices and 
because of this we are asking you to assist us 
in determining the extent of such discrimina- 
tion. We are not interested in whether such 
practices are proper or improper. As much as 
you can, then, try to answer the questions as 
objectively as possible. Your frankness and 
cooperation in this survey will be deeply ap- 
preciated. After you have completed the 
questionnaire, please return it in the stamped
envelope which is enclosed. DO NOT SIGN 
THE QUESTIONNAIRE. If you desire a 
report on the results of this survey, print your 
name and address on the enclosed postal card 
and mail it separately.” 

The questionnaire itself obtained informa- 
tion on age, sex, and counseling experience. It 
also inquired into the predominant occupa- 
tional level of job orders received, typical 
education of clients, and various aspects of 
discrimination. 


Results 


Ethnic Discrimination. As shown in Table
1, wide differences are reported to exist in the
difficulty of placing typical individuals of
various ethnic groups even when all are equally
qualified. Most difficult to place are Negroes.
Next come Mexicans and Orientals. Of inter-
mediate difficulty in placement are persons of
Italian, Polish, and Slavic descent. Easiest
to place are persons of Irish, English, and
German descent. The range of difficulty in
placement is suggested by the comparative
facts that four-fifths of the counselors report
Irish, English, and German applicants easy
to place, whereas only two per cent report
Negroes, Mexicans, and Orientals as easy to
place. Eighty-four per cent of the counselors
reported Negroes as “hard to place” or “never
place” even if qualified.


Table 1


“How difficult is it to place persons of ______ descent,
assuming that they are qualified?” *


Ethnic         Always   Easily   Average      Hard to   Never
Group          Place    Place    Difficulty   Place     Place

1. Irish         31       51        13          —         —
2. English       38       47         9          4         —
3. German        33       47        16          2         —
4. Slavic         4       33        51     (illegible)    —
5. Polish         4       33        49          7         —
6. Italian        2       38        38         20         —
7. Oriental       —        2        22         62         4
8. Mexican        —        2        22         56        11
9. Negro          —        2         2         44        40

* Frequencies of responses are reported as percent-
ages. Their sums horizontally are not equal to 100
because of the instances of “no reply.”


¹ The questionnaire has been filed with the American
Documentation Institute. Order Document No. (number
illegible) from American Documentation Institute, 1719 N Street,
N.W., Washington 6, D. C., remitting $1.00 for micro-
film (images 1 inch high on standard 35 mm. motion
picture film) or $1.(cents illegible) for photocopies (6 × 8 inches)
readable without optical aid.

The data of Table 1 resemble somewhat the
Bogardus experience with “social distance” 
data; however, a slightly simpler explanation 
becomes apparent upon examination of the 
critical ratios of successive ethnic group ranks. 
The three easily placed groups (Irish, 
English, and German) are not significantly 
different among themselves in ease of place- 
ment according to these data. A second-level 
cluster (Slavic, Polish, Italian) of ethnic groups 
likewise do not differ significantly from each 
other in ease of placement. This is also true
of the third level (Oriental, Mexican). How-
ever, each group in any level does differ sig-
nificantly in ease of placement from any group
in any other level including the fourth level
(Negro). The homogeneity of variance and
over-all variance ratio were established as
significant before critical ratios were deter-
mined.
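
The article does not state how its critical ratios were computed, so the following is offered only as an illustration of the general form of such a test, not as the authors' procedure. It treats the per cent of counselors rating a group "always place" or "easily place" (from Table 1) as a proportion and compares two groups with the usual pooled standard error. Two assumptions should be noted: the 45 respondents are treated as if they were independent samples for each group, which they are not, and the function name is hypothetical.

```python
import math


def critical_ratio(p1, n1, p2, n2):
    """Difference between two proportions divided by its pooled standard error."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


N = 45  # respondents

# Proportion answering "always place" or "easily place" (Table 1 percentages).
easy = {
    "Irish": 0.82,    # 31 + 51
    "German": 0.80,   # 33 + 47
    "Slavic": 0.37,   # 4 + 33
}

within_level = critical_ratio(easy["Irish"], N, easy["German"], N)
between_levels = critical_ratio(easy["German"], N, easy["Slavic"], N)

print(f"Irish vs. German:  CR = {within_level:.2f}")
print(f"German vs. Slavic: CR = {between_levels:.2f}")
```

Under these assumptions the ratio within the first cluster is small, while the ratio between the first and second clusters is well beyond the conventional 1 per cent point, which agrees in direction with the clustering described in the text.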

Religious Discrimination. The per cent of
counselors reporting major religious groups as
“always place” or “easily place” if qualified
were as follows: Protestant, 84; Catholic, 73;
Jewish, 9. No counselor reported either
Protestants or Catholics as “hard to place” or
“never place” even if qualified, but 46 per cent
stated this to be true of Jews. It seems from
these data, at least in the Chicago area, that
Jews are as difficult to place as are Orientals.

Job Requirements. The counselors were 
asked “When arranging an interview for a 
client, how frequently are you asked to supply 
the following information?” The items of in- 
formation follow with the per cent of counse- 
lors reporting “always” or “most of the time”: 
experience, 98; full name, 88; age, 84; personal 
appearance, 60; racial descent, 40; nationality, 
36; address, 26; religion, 25. 

They also were asked “When discussing the 
rejection of an applicant with an employer, how 
frequently are the following mentioned as 
reasons for the rejection?” The reasons follow 
with the per cent of counselors stating either 
“always” or “most of the time”: experience, 91; 
age, 22; appearance, 13; race, 8; religion, 4; 
nationality, 4; address, 2; name, 0. 

It appears from the first answers above that 
discrimination is specifically present in at least 
one-third of the job orders received by the 
counselors. However, this may underesti- 
mate the problem. One counselor appended 
the following typical note to her question- 
naire. “The job order contains two blanks— 
Non-Nordic (Jew-Neg-Orien) & Nordic (Chris-
tians) thus enabling our counselors to prevent
the embarrassment of sending a Jew, etc. to a 
Nordic company. So the nationality-religion 
problem never enters as a reason for rejection.” 

The finesse with which discrimination is 
handled is illustrated by the ambiguous use 
of the term “personality” as shown by the 
statement of another counselor. “It is taken 
for granted that in screening applicants for 
employment we will not recommend one 
whose personality or appearance is not accept- 
able. Experience gives counselors knowledge 
of types of applicants acceptable to company’s 
representative, and company’s representative 
a confidence in counselor's recommendations.” 

Size and Extensity of Firm. When asked “In
your opinion, which of the following is most 
particular about the race or religion of job ap- 
plicants?” the counselor respondents replied as 
follows: small firms, 36; large firms, 33; 
medium-sized firms, 18; no reply, 13. The 
same type of judgment was requested for local 
and national organizations. “In your opinion, 
which of the following is more particular about 


the race or religion of job applicants?” The 
per cents of respondents replying were as fol- 
lows: local, 47; nation-wide, 38; no reply, 16. 
These results are difficult to interpret, but are 
believed not to indicate any extreme unique- 
ness of discrimination as related to size and 
extensity of firm. 

Policy Origin. Who sets discriminatory 
policies? Although this study could not ob- 
tain detailed information on this question, the 
following aspects were investigated. “In 
your opinion, is selection by race and religion 
company policy, the personal policy of the 
personnel director, or the expressed wish of the 
employees?” The per cents of counselors 
giving each cause are as follows: company 
policy, 53; personnel director, 22; employees, 
18; no reply, 7. 

Effect on Counselor. “Do you feel that any 
discrimination that does exist makes your job 
less difficult, more difficult, or about the 
same?” The distribution of replies was as fol- 
lows: more difficult, 67; about the same, 31; 
less difficult, 2. The older and more experi- 
enced counselors were more prone to reply 
“about the same” to this question than were 
the younger counselors who more consistently 
replied “more difficult.” This is interesting, 
because replies to no other question were 
significantly related to age. 


Summary and Conclusions 


A sample of private employment agency 
counselors in the Chicago area was requested 
by mail to cooperate in a study of unfair em- 
ployment practices (22 per cent cooperated). 
The replies of the 45 respondents suggest the 
following tentative conclusions. 


1. According to the reports of these employ- 
ment counselors, the ethnic group which bears 
the severest brunt of job discrimination is the 
Negro; 84 per cent of the counselors report ex- 
treme difficulty in their placement “even if 
qualified.” 

2. The second cluster of ethnic groups in 
order of job discrimination experience includes 
the Mexicans and Orientals.

3. The third cluster (less intense discrimina- 
tion) includes Slavs, Poles, and Italians. 


4. The fourth cluster includes Irish, English,
and Germans.

5. Religious discrimination is directed
largely against the Jews.

6. Discrimination is reported as specifically
apparent in at least one-third of the job orders
received by the private agency counselors.

7. A majority of counselors responding be-
lieve unfair employment practices to be de-
liberate company policy.

8. Two-thirds of the respondents report
that the “discrimination that does exist”
makes their jobs more difficult.

9. The hypothesis that private employment
agency counselors can supply critical informa-
tion about the operation of discrimination in
employment appears, from these results, to
be verified.


Received January 16, 1952. 


References 


1. Krech, D., and Crutchfield, R. S. Theory and prob-
lems of social psychology. New York: McGraw-
Hill Book Co., 1948.

2. Murphy, G., Murphy, L. B., and Newcomb, T. M.
Experimental social psychology. New York:
Harper & Bros., 1937.

3. Nudell, I. G., and Paterson, D. G. Attitudes of
clerical workers toward three types of employ-
ment agencies. Personnel, 1950, 26, 330-334.

4. Saenger, G., and Gordon, N. S. The influence of
discrimination on minority group members in
its relation to attempts to combat discrimination.
J. soc. Psychol., 1950, 31, 95-120.

The Accuracy of Application Blank Work Histories 


James N. Mosel


The George Washington University 


and 
Lee W. Cozan 


The Hechinger Company, Washington, D. C. 


Although work histories have long been a 
staple item in employment procedures, the 
accuracy with which such data are reported by 
the applicant has received remarkably little 
systematic quantitative investigation. The 
only adequate study is the recent work of 
Keating, Paterson, and Stone (4). These in- 
vestigators studied the accuracy of three work 
history items (wages, length of previous em- 
ployment, and job duties) obtained by coun- 
seling interviews. By correlating worker 
claims with records of previous employers, it 
was found that applicant reports were sur- 
prisingly accurate for both recent and remote 
employment, the correlations ranging from 
.90 to .98. 

The most usual method, however, for ob- 
taining data on previous employment is 
through a written questionnaire—the applica-
tion blank in industry and Form 57 or “unas-
sembled examination” in government. While
the literature of personnel psychology contains 
numerous studies of the utility of such data to 
predict job success, the accuracy of the ap- 
plicant’s responses has apparently received 
little attention. It would appear hazardous 
to project the findings of Keating et al. to the
application blank inasmuch as these results 
were obtained by interview and in a counseling 
situation where there was presumably little in- 
centive to distort. 

A survey of the literature reveals for the 
most part only “incidents” rather than system- 
atic quantitative investigation of this ques- 
tion. The cases mentioned by Moore (6) are 
conventionally cited in this connection. In 
the army during World War I, it was found 
that only six per cent of those claiming to be 
skilled in a particular trade actually had ade-
quate skill, while over 30 per cent proved
totally inexperienced (“trade bluffers”). The
U. S. Civil Service Commission reported that 
over 0.3 per cent of the applicants in one year 
were barred for attempting to deceive the Com- 
mission. Among applicants for a position in a 
city department in New York, two per cent 
were found to have criminal records although 
they had sworn in their applications that they 
had none. 

More recently, Lipsett (5) reports that in 
the Second Civil Service Region, 46 per cent of 
the cases subjected to a field investigation re- 
vealed falsification in the application for em- 
ployment. In an unpublished study, Goheen 
and Mosel (2) found that of 69 civil service ap- 
plicants for the job of traffic and transporta- 
tion clerk, about ten per cent submitted inac- 
curate statements concerning previous job 
duties. Seven per cent of these were favorable 
to the applicant, while only three per cent in- 
volved misstatements of major importance. 

With military personnel, however, Harris 
(3) found that only rarely do individuals falsify 
on certain items pertaining to previous psychi- 
atric experiences. In fact, he found a tendency 
to over-report such occurrences. As a conse- 
quence, Conrad and Ellis (1) suggest that 
military personnel may provide personal his- 
tory information with greater accuracy than 
civilians ordinarily do. 

Parten (7) has reviewed the scant literature 
on the test-retest reliability and the accuracy 
(as judged by objective records) of such 
personal data as age, car and telephone owner- 
ship, etc., as obtained in public opinion and 
market research. Correlations were in the 
vicinity of .90 and percentages of agreement 
were generally in the 80’s. 

The significance of the accuracy of applica-
tion blank work histories is twofold. In the
first place, accuracy of response imposes a
limit upon the potential utility of such data in 
predicting job success (assuming that there is 
no correlation between job success and the di- 
rection of distortion). Secondly, the amount 
of distortion would be an important determi- 
nant of the necessity of a second selection aid— 
the recommendation questionnaire or reference 
check. The primary (and perhaps only) value 
of this device is that it provides verification of 
the assertions in the application blank. Veri- 
fication may contribute to the predictive 
validity of the application blank by providing 
a more adequate basis from which predictions 
could be made. Furthermore, the amount of 
distortion uncovered may in itself serve as a 
revealing consideration in applicant evaluation. 
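
The first point, that accuracy sets a ceiling on predictive utility, can be made concrete. If the claimed value equals the verified value plus an error that is uncorrelated with both the verified value and job success (the assumption stated parenthetically above), then the correlation of the claim with job success equals the correlation of the verified value with job success multiplied by the claimed-verified correlation. The sketch below checks this on a small constructed data set; the variables and the error weight of 0.5 are illustrative assumptions and are not data from the study.

```python
import itertools
import math


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# A 2 x 2 x 2 factorial layout makes the three components exactly uncorrelated:
# v = verified work-history value, e = reporting error, u = other causes of success.
design = list(itertools.product([-1, 1], repeat=3))

verified = [v for v, e, u in design]
claimed = [v + 0.5 * e for v, e, u in design]   # claim = fact + error
success = [v + u for v, e, u in design]         # hypothetical criterion

accuracy = pearson(claimed, verified)            # claimed-verified correlation
true_validity = pearson(verified, success)       # validity of the verified item
observed_validity = pearson(claimed, success)    # validity of the claim as reported

print(round(accuracy, 3), round(true_validity, 3), round(observed_validity, 3))
```

On this reasoning a claimed-verified correlation of .93 would reduce a predictive validity of, say, .40 for the verified item to about .37 for the item as claimed, so long as the distortion is unrelated to job success.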

The purpose of the present investigation is 
to present some much needed evidence on the 
accuracy of work history data obtained by the 
application blank in a setting where distortion 
might be expected to occur, i.e., the usual 
industrial employment situation. In con- 
ducting such a study, it is essential to deter- 
mine not only the frequency of errors but also 
the magnitude. To this end, the technique of 
correlational analysis employed by Keating 
et al. was used. 




Procedure 


The application blanks studied were those 
submitted over a one-year period to the person- 
nel department of a building supply and 
engineering company. These applications 
were for sales and office positions and included 
the usual information on nature of previous 
experience, dates of employment, weekly 
salary, and the names and addresses of previ- 
ous employers for each job held. The appli- 
cants were not informed that their statements 
would be subjected to verification, although 
it must be assumed that at least some expected 
a reference check to be made. 

Three items of work history were submitted 
to verification: weekly salary, duration of em- 
ployment, and job duties. Verification was 
achieved in a few cases of a local nature by 
telephone check, but the majority required a 
mailed questionnaire or recommendation form. 
The return rate on this questionnaire was ap- 
proximately 95 per cent. Further checks were 
made in many cases through a credit bureau. 
In making these inquiries, the former employer 
was not informed of the applicant’s claims; 
rather he was requested to submit the desired 
information without such knowledge. This 
was done to prevent the former employer from 


“verifying” applicant claims without actually


Table 1


Relationship between Claimed and Verified Weekly Wage for 61 Male Applicants on Jobs Held
0-12 Months Prior to Application for Employment


r = .93


[Scattergram of claimed weekly wage (rows) against verified weekly wage (columns), in $10 classes from $10-19 to $100 and over. The individual cell frequencies are not legible; only the column totals survive.]

Verified Weekly Wage   $10-19  20-29  30-39  40-49  50-59  60-69  70-79  80-89  90-99  100 and Over  Total
Total                     2      0      5      7     15     11     11      6      3         1         61




Table 2


Relationship between Claimed and Verified Weekly Wage for 65 Female Applicants on Jobs Held
0-12 Months Prior to Application for Employment


r = .94


[Scattergram of claimed weekly wage (rows) against verified weekly wage (columns), in $10 classes from $20-29 to $80-89. The cell frequencies are not legible.]


checking the records and from “giving the ap- 
plicant a break” by refusing to disclose falsifi- 
cations. 

Data on 61 men and 65 women were suffi- 
ciently complete to permit analysis. In order 
to investigate the possibility that memory dis- 
tortion might operate differentially in terms of 
the recency with which the jobs were held, 
accuracy was studied separately according to 
remoteness in time. The time interval cate- 
gories were identical with those adopted by 
Keating et al., i.e., a 0-12 month category
which included all jobs terminated within one year
prior to applying for the present employment;
a 13-24 months category which included all
jobs terminated from 13 months to two years
before application; and so on. Work histories
of male and female applicants were analyzed
separately to uncover any sex differences.
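
The recency categories can be expressed as a simple classification rule. The sketch below assumes that remoteness is measured in whole months between job termination and application, and it takes the upper categories (37-48, 49-60, 61 and over) from the headings of the later tables; the function name is illustrative only.

```python
def recency_category(months_before_application):
    """Assign a job to a recency class from whole months between termination and application."""
    months = months_before_application
    if months < 0:
        raise ValueError("termination cannot follow the application date")
    if months <= 12:
        return "0-12"
    if months > 60:
        return "61 and over"
    # Twelve-month classes beginning at 13, 25, 37 and 49 months.
    lower = 13 + 12 * ((months - 13) // 12)
    return f"{lower}-{lower + 11}"


for months in (3, 12, 13, 24, 25, 48, 60, 61, 120):
    print(months, recency_category(months))
```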


Results 


Data were analyzed and displayed in the 
same manner as used by Keating et al. in
order to afford a comparison of results. 

One-Year Time Interval. A scattergram of 
the relationship between claimed and verified 
weekly wage on jobs held 0-12 months prior to 
application for employment is shown in Table 1 
for males and in Table 2 for females. The 
correlation is .93 for men and .94 for women. 

The scattergrams for duration of employ-
ment are shown in Tables 3 and 4. The cor-
relation between claimed and verified duration
is .98 for both male and female applicants.


Table 3


Relationship between Claimed and Verified Duration of Employment for 61 Male Applicants on
Jobs Held 0-12 Months Prior to Application for Employment


r = .98


                         Verified Duration (in months)

Claimed Duration                                          50 and
(in months)       0-9    10-19   20-29   30-39   40-49    Over    Total

50 and over        —       —       1       —       —        7       8
40-49              —       —       —       —       3        —       3
30-39              —       —       —       5       —        —       5
20-29              —       —      10       —       —        —      10
10-19              1      10       —       —       —        —      11
0-9               24       —       —       —       —        —      24

Total             25      10      11       5       3        7      61


Table 4


Relationship between Claimed and Verified Duration of Employment for 65 Female Applicants on
Jobs Held 0-12 Months Prior to Application for Employment


r = .98


                         Verified Duration (in months)

Claimed Duration                                          50 and
(in months)       0-9    10-19   20-29   30-39   40-49    Over    Total

50 and over        —       —       —       —       —       13      13
40-49              —       —       —       2       2        —       4
30-39              —       —       —       2       —        —       2
20-29              —       —       2       —       —        —       2
10-19              —      12       —       —       —        —      12
0-9               32       —       —       —       —        —      32

Total             32      12       2       4       2       13      65


Not shown graphically is the relationship
between claimed and verified job duties.
Agreement was found in 95 per cent of the
male cases and in 94 per cent of the female
cases.
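
A scattergram of this kind can be summarized directly from its cell frequencies. The sketch below uses the Table 3 frequencies for the 61 male applicants, with cell positions read off with the help of the row and column totals. It counts exact agreements, overstatements, and understatements, and computes a product-moment correlation on the class codes 0 to 5. Because class codes are used in place of actual months, the coefficient is only an approximation to the published value; the function names are illustrative.

```python
import math

# (claimed class, verified class) -> number of male applicants (Table 3).
# Classes are coded 0..5 for 0-9, 10-19, 20-29, 30-39, 40-49, 50 and over months.
TABLE_3 = {
    (0, 0): 24,
    (1, 0): 1, (1, 1): 10,
    (2, 2): 10,
    (3, 3): 5,
    (4, 4): 3,
    (5, 2): 1, (5, 5): 7,
}


def direction_counts(cells):
    """Return counts of exact agreement, overstatement, and understatement."""
    exact = sum(n for (c, v), n in cells.items() if c == v)
    over = sum(n for (c, v), n in cells.items() if c > v)
    under = sum(n for (c, v), n in cells.items() if c < v)
    return exact, over, under


def grouped_r(cells):
    """Product-moment correlation computed from grouped (class-coded) frequencies."""
    total = sum(cells.values())
    mean_c = sum(c * n for (c, v), n in cells.items()) / total
    mean_v = sum(v * n for (c, v), n in cells.items()) / total
    s_cv = sum(n * (c - mean_c) * (v - mean_v) for (c, v), n in cells.items())
    s_cc = sum(n * (c - mean_c) ** 2 for (c, v), n in cells.items())
    s_vv = sum(n * (v - mean_v) ** 2 for (c, v), n in cells.items())
    return s_cv / math.sqrt(s_cc * s_vv)


exact, over, under = direction_counts(TABLE_3)
r = grouped_r(TABLE_3)
print(f"exact {exact}, overstated {over}, understated {under}, r = {r:.2f}")
```

On these frequencies both discrepant cases are overstatements, and the class-coded coefficient comes out at about .97, close to the .98 reported for the ungrouped data.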

Longer Time Intervals. The correlation be-
tween claimed and verified wages for all jobs
held over 12 months prior to application for
employment was .91 for the male cases and
.90 for the female cases. Table 5 shows the
correlations for each specific time interval
category within this longer period. Because
the number of cases in each interval is so
small, considerable reservation must be held
in using these coefficients to estimate the cor-
relations in the population. It is clear, how-
ever, that in the present sample, there is no
indication of a relationship between accuracy
of response and the recency of the employment
reported, nor of any sex difference.


Table 5


Correlations between Claimed and Verified Weekly
Wage for Male and Female Applicants on Jobs Held
at Time Intervals Greater than One Year
Prior to Application for Employment

[Entries (N and r for men and for women in each time interval) are not legible.]

The correlation between claimed and veri- 
fied duration of employment for the longer 
time interval was .97 for male applicants and 
.87 for the female applicants. Correlation 
coefficients for each specific time interval are 
shown in Table 6. Here again there appears 
no evidence of decreased accuracy for the less 
recent jobs, nor is there any indication of a
sex difference.

As for job duties, agreement between appli- 
cant claim and employer report was found in 
84 per cent of the male applicants and in 8/ 
per cent of the female applicants. 


Table 6


Correlations between Claimed and Verified Duration of
Employment for Male and Female Applicants on
Jobs Held at Time Intervals Greater than One
Year Prior to Application for Employment


Time Interval
(Job termination date
prior to application             Men              Women
for employment)                N      r          N      r

13-24                         21     .99      (entries not legible)
25-36                          8     .99
37-48                          7     .97
49-60                          4     .99
61 and over                    8     .98




Inspection of the above tables brings out
two additional observations: 1. Of the rela-
tively few distortions that do occur, the great
majority are in the direction of overestimation
and hence in the applicant’s favor. This was
not apparent in the data of Keating et al. 2.
For the more recent employment, wage claims
are somewhat more susceptible to distortion
than are claims concerning duration of em-
ployment. This difference was not apparent
for the more remote jobs. These same re-
lationships also hold for the data of Keating
et al. The fact that items have a differential
susceptibility to distortion is further sup-
ported by the study by Goheen and Mosel (2)
on traffic and transportation clerks. Ten per
cent of these applicants gave advantageously
inaccurate replies to the item “reasons for
separation”; seven per cent on job duties;
and one per cent on wages.


Summary 


The present study was undertaken to deter- 
mine the accuracy of work history data as 
contained in the applications for employment 
submitted to a building supply and engineering 
company. High agreement was found be- 
tween applicant claims and verifications by 
past employers with respect to weekly wages, 
duration of employment, and job duties. All 
correlations except one were .90 or greater. 
Furthermore, there was no evidence of a re- 
lationship between validity of claim and re- 
cency of job; nor was there any apparent sex 
difference. 

There was, however, a difference in the sus- 
ceptibility to distortion of items for the more 
recent jobs. Wages were more subject to mis- 
statement than was duration of employment. 


These results are in very close agreement
with the findings of another study in which the
same items were obtained by counseling inter-
view. It might have been expected a priori
that the employment situation of the present
study would have encouraged greater distor-
tion than the counseling interview. This,
however, does not seem to be the case. Ap-
parently, the implied verification of application
data in the employment situation is a sufficient
deterrent to distortion to provide an accuracy
comparable to that which the counseling situ-
ation achieves through lack of incentive to
distort. One point of difference, however,
was noted. In the present study distortions
tended markedly in the direction of over-
estimation, a trend not found in the study of
the counseling interview. It may well be
that the two situations differ not so much in
amount of work history distortion as in the
direction of distortion.
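
The two quantities on which this summary turns, the agreement between claimed and verified figures and the direction of any discrepancies, can be sketched briefly in Python. The wage figures below are invented for illustration and are not data from the study; the function names `pearson_r` and `agreement_summary` are likewise hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def agreement_summary(claimed, verified):
    """Correlation plus a count of over-, under-, and exact statements."""
    diffs = [c - v for c, v in zip(claimed, verified)]
    over = sum(d > 0 for d in diffs)    # claim exceeds the verified figure
    under = sum(d < 0 for d in diffs)   # claim falls below the verified figure
    return {
        "r": pearson_r(claimed, verified),
        "over": over,
        "under": under,
        "exact": len(diffs) - over - under,
    }

# Hypothetical weekly wages (dollars): applicant claims vs. employer verifications.
claimed = [52.0, 48.0, 60.0, 45.0, 55.0, 70.0]
verified = [50.0, 48.0, 60.0, 44.0, 55.0, 70.0]
summary = agreement_summary(claimed, verified)
```

A high `r` together with discrepancies that fall almost entirely on the "over" side would correspond to the pattern described here: close agreement overall, with the few distortions running in the applicant's favor.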


Received November 27, 1951. 


References 


1. Conrad, H. S., and Ellis, A. The validity of per-
sonality inventories in military practice. Psy-
chol. Bull., 1948, 45, 385-426.

2. Goheen, H. W., and Mosel, J. N. The accuracy of
applications for Civil Service employment. Un-
published study, 1950.

3. Harris, H. J. The Cornell Selectee Index—an aid
in psychiatric diagnosis. Annals N. Y. Acad.
Sci., 1946, 46, 594-603.

4. Keating, Elizabeth, Paterson, D. G., and Stone,
C. H. Validity of work histories obtained by
interview. J. appl. Psychol., 1950, 34, 1-5.

5. Lipsett, L. The personal investigation in the selec-
tion of employees. Pers. Admin., 1946, 9, 23-29.

6. Moore, H. Psychology for business and industry.
New York: McGraw-Hill, 1942, p. 71.

7. Parten, Mildred. Surveys, polls and samples. New
York: Harper, 1950, Chap. 16.




The Relationship between the Predictive Power of Aptitude Tests
for Trainability and for Job Proficiency 


C. W. Brown and E. E. Ghiselli 


University of California, Berkeley 


The success of workers has been measured in 
a variety of ways, but it is possible to distin- 
guish at least two broad categories of criteria, 
namely, success in training and degree of job 
proficiency achieved after training. In many 
instances it is difficult to clearly differentiate 
between these two types of criteria. For ex- 
ample, on some jobs the worker receives his 
instruction quite informally, and if his success 
is measured in terms of production any differ- 
entiation between the learning period and the 
post-learning period is purely arbitrary. Fur- 
thermore, for certain jobs longitudinal studies 
of worker proficiency indicate continued im- 
provement over a period of years and, there- 
fore, it is hard to say just where the learning 
period terminates. 

Nevertheless, it is often feasible to make a 
distinction between trainability and level of 
job proficiency as measures of worker success, 
and ordinarily such a differentiation is made. 
Examples of training criteria are grades earned 
in formal training courses and instructors’ or 
supervisors’ ratings of worker progress during 
the learning period. Job proficiency has been 
measured in a variety of ways including speed 
and amount of production, achievement tests, 
supervisors’ ratings, and the like. 

Knowing that at least a rough distinction 
can be made between these two classes of 
criteria, we can consider a further problem. 
Only infrequently is attention paid to whether 
the types of abilities important in learning the 
skills and knowledges required for a job are 
the same as those that are important for 
performance after these skills and abilities 
have been acquired. Undoubtedly, few people 
would hold that they are identical, although 
most would expect considerable communality. 
The question arises: Can measures in one of 
these areas (learning or job proficiency) be
used to predict success in the other area? 

Obviously, the nature of the index utilized
to measure success in learning and the index
used to measure job proficiency may differ
considerably. Thus, in the training of welders
their improvement might be measured in terms
of performance on work sample tests, while the
success on the job itself might be gauged in
terms of supervisors’ ratings. In this case, an
aptitude test which forecasts the learning
criterion to a particular degree might be ex-
pected to forecast the job proficiency criterion
to a somewhat different degree. Nevertheless,
it would be expected that a test which had no
validity for the one criterion would at best
have low validity for the other, and a second
test that had high validity for the one would
be at least moderately valid for the other.

In other words, one would expect to find a
positive correlation of at least moderate magni-
tude to hold between the validity coefficients
of various tests determined on the basis of
their predictive power for training and their
predictive power for proficiency. Indeed,
knowing that scores on a test are related to
trainability of workers for a particular job to
a particular degree one is very likely to
generalize to the conclusion that it has ap-
proximately the same validity for predicting
proficiency, and vice versa. The purpose of
the present paper is to investigate this hypothe-
sis that there is a substantial correspondence
between the degree to which a test is valid in
the prediction of trainability and the degree
to which it is valid in predicting job profi-
ciency.


Methods and Procedure 


The present writers have developed a file of
investigations conducted in the United States
since 1919 relative to the validity of tests. In
this file tests and occupations are classified by
type. It is possible, then, to obtain a sum-
mary of the findings concerning the validity a
particular type of test has for a particular
occupation either with respect to training or
with respect to job proficiency. In summariz-
ing such a validity we have computed the
weighted mean validity coefficient through
Fisher’s z’ transformation. We have con-
sidered the shortcomings of this procedure
elsewhere.¹
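
The averaging step can be illustrated with a short Python sketch. The text does not state how the coefficients were weighted, so the weights of n - 3 used below (the reciprocal of the sampling variance of z) are an assumption, as are the function name `weighted_mean_r` and the example coefficients and sample sizes.

```python
import math

def weighted_mean_r(coefficients, sample_sizes):
    """Average correlations through Fisher's z transformation.

    Each r is converted to z = atanh(r), the z values are averaged with
    weights of (n - 3) (an assumed weighting, not taken from the paper),
    and the mean z is converted back to r with tanh.
    """
    zs = [math.atanh(r) for r in coefficients]
    weights = [n - 3 for n in sample_sizes]
    mean_z = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    return math.tanh(mean_z)

# Hypothetical validity coefficients from three studies of one test type
# against one occupation, with their sample sizes.
rs = [0.20, 0.35, 0.10]
ns = [50, 120, 80]
mean_r = weighted_mean_r(rs, ns)
```

Averaging in z units rather than averaging the raw coefficients gives somewhat more weight to the larger correlations, which is one reason the choice of procedure matters when the studies disagree.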

Our approach, then, is to take as the index of 
validity the average of the validity coefficients 
found in previous studies. Thus, ideally, for 
each type of test and for each type of job we 
would have the average validity coefficient in 
the prediction of trainability and the average 
coefficient in the prediction of job success. 
Unfortunately, the data available from pre- 
vious investigations are incomplete and, there- 
fore, we do not have a pair of validity co- 
efficients for each type of test for each type of 
occupational group. Hence, the comparisons 
and relationships given in this report are not 
as complete as would be desirable. 

The classification of tests which we have 
employed is given below. Detailed descrip- 
tions of the various rubrics are to be found in 
the previously mentioned report. 

Intellective: Intelligence, Immediate memory, 
Substitution, and Arithmetic; Speed of Per- 
ception: Number comparison, Name com- 
parison, Cancellation, and Speed of perception; 
Spatial: Pursuit, Location, Tracing, and Spa- 
tial relations; Motor: Tapping, Dotting, Finger 
dexterity, Hand dexterity, and Arm dexterity; 
and Mechanical Principles. 

The categories into which all jobs were 
classified are as follows: Clerical Occupations: 
General clerks, Recording clerks, and Comput- 
ing clerks; Protective Service Occupations; 
Personal Service Occupations; Vehicle Opera- 
tors; Trades and Crafts: Mechanical repairmen, 
Electrical workers, Structural workers, Proc- 
essing workers, Complex machine operators, 
and Machining workers; and Manipulative and
Observational Occupations: Machine tenders,
Bench workers and assemblers, Inspectors,
Packers and wrappers, and Gross manual
workers.


Results 
All data pertinent to the present problem
are given in Table 1. The first question con-
cerns the degree of relationship obtaining be-
tween the validity coefficients for the two types
of criteria. Taking all tests and jobs together,
a total of 127 pairs of validity coefficients were
available. For these cases the coefficient of
correlation was found to be .17. The second
column of the table presents coefficients by
type of test and type of job. It will be noted
that for these various breakdowns the re-
lationship is of about the same order. It is
apparent from these data that the relationship
between the validity coefficients of tests for
trainability and their validity for job profi-
ciency is low. There is only a slight tendency
for a test that proves useful in predicting
trainability also to be useful in predicting job
proficiency, and vice versa.


¹ Ghiselli, E. E., and Brown, C. W. Validity of
aptitude tests for predicting trainability of workers.
Personnel Psychol., 1951, 4, 243-260.


Table 1


Relationships between Validity Coefficients of Tests
Used in the Prediction of Trainability and of
Job Proficiency, together with Median
Differences between Coefficients


                                   r between      Median Difference
                                   Validity       between
                            N      Coefficients   Coefficients
Type of Test
  Intellective              35     .21            Ai
  Speed of Perception       19     4              .13
  Spatial                   38     .12            .07
  Motor                     22     .09            .07
Type of Job
  Clerical                  31     .20            .12
  Trades and Crafts         68     .15            .11
  Manipulative and
    Observational           21     .22            .10
All Tests and Jobs         127     .17            .10

In the final column of the table are given the
medians of the differences between the validity
coefficients for the two types of criteria.
Taking all tests and jobs together the median
difference between the coefficients is .10.
While at first glance this difference may not be
considered particularly large, it is to be re-
membered that the validity coefficients for
separate tests are seldom found to be over .50.
Therefore, in general, the median difference is
about one-fifth of the effective range of validity
coefficients. Reference to the median differ-
ences for breakdowns of the individual tests
and jobs as given in Table 1 indicates that they
do not vary significantly from the median
value for the total set of data.
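
The two statistics reported in this section can be sketched in Python as follows. The paired coefficients are invented for illustration; the use of absolute differences is an assumption, since the text speaks only of "differences"; and `pearson_r` is a hypothetical helper name. Only the last three lines use figures taken from the text (.10 and .50).

```python
import math
from statistics import median

def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical pairs: (validity for training, validity for job proficiency),
# one pair per combination of test type and job type.
pairs = [(0.30, 0.15), (0.10, 0.25), (0.45, 0.40), (0.20, 0.05), (0.35, 0.30)]
training = [t for t, _ in pairs]
proficiency = [p for _, p in pairs]

# Correlation between the two sets of validity coefficients.
r_between = pearson_r(training, proficiency)

# Median of the absolute differences between paired coefficients.
median_diff = median(abs(t - p) for t, p in pairs)

# Figures reported in the text: a median difference of .10 set against
# an effective range of validity coefficients of about .50.
reported_median_diff = 0.10
effective_range = 0.50
share_of_range = reported_median_diff / effective_range  # about one-fifth
```

Expressing the median difference as a share of the effective range is what turns an apparently small figure of .10 into a sizable one.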


Discussion 


The findings herein presented indicate that 
only a low relationship exists between the 
validity of tests in the prediction of train- 
ability and in the prediction of job proficiency. 
That is, a test which is found to have high 
validity in the prediction of workers’ capacity 
to learn a job may have relatively little 
validity in predicting how well they will per- 
form on that job after their training has been 
accomplished. These results suggest that 
the abilities important for learning a job may 
differ markedly from those important in the 
maintenance of proficiency on the job. 


There are, of course, a number of laboratory 
investigations indicating that the abilities in- 
volved in the early stages of learning a task 
differ from those involved in later stages. It 
seems likely that a similar situation would 
exist with respect to an industrial task. How- 
ever, it would be hazardous to draw this con- 
clusion on the basis of the present analysis 
since the coefficients here utilized are subject 
to some distortion due to such factors as un- 
reliability of criteria and restriction of range 
of talent. In any event it is clear that one 
cannot be at all secure in using a test that has 
been validated against training criteria as a 
basis for forecasting job proficiency, or in 
using a test validated against job proficiency 
criteria as a basis for forecasting success in 
training. 


Received December 6, 1951. 


Validation of a Clinical Approach to the Placement of Engineers 


Ross Harrison* and Theodore A. Jackson


Stevenson, Jordan and Harrison, Inc., New York City


While there have been innumerable validity 
studies which correlate batteries of psycho- 
logical tests with job proficiency criteria and 
while there have also been many attempts to 
predict the achievement of students in en- 
gineering school from psychometric measures, 
the technical literature is singularly unre- 
warding on the subject of predicting job suc- 
cess of engineers in industry. The present 
note is a report on a personnel evaluation 
program with mechanical engineers in the 
Wright Aeronautical Division of the Curtiss-
Wright Corporation.¹

Usually articles on the validity of personnel 
selection procedures are primarily statistical 
studies which relate batteries of tests to meas- 
ures or ratings of job adjustment. In recent 
years it is being realized in some quarters that 
a strictly psychometric or statistical approach 
to occupational prognosis is limited because it 
ignores psychological characteristics which, 
while not directly measurable, are neverthe- 
less important and may be evaluated. The 
use of multiple correlational methods with 
professional, sales, and executive personnel is 
less rewarding because the tests that make up 
the batteries deal more with cognitive abilities 
and do not adequately represent personality 
dimensions. Furthermore, such an approach 
neglects not only intrapersonal dynamics but 
also the integration of segmental information 
about the individual which is necessary for 
understanding the probabilities of job success. 
The global or holistic methods of clinical psy- 
chology such as projective techniques and 
depth interviewing allow for greater synthesis 
of relevant data, whether qualitative or 
quantitative, and are coming into wider use in 
personnel selection by some psychologists 
working in the industrial area. The usual
objection to the clinical approach is that it is
not objective and cannot be properly validated
and hence is not scientifically acceptable. The
rejoinder is that, while it is true that clinically
oriented psychologists seem less inclined to
validate their procedures in a rigorous manner
and while the clinical methods are admittedly
subjective, the effect of applying these methods
can be objectively and empirically validated.
In a small way we hope, in this article, to
contribute to the validation of clinical pro-
cedures in industrial placement. At the same
time we realize that, because of practical con-
siderations, our methods of validation are
lacking in methodological refinement.


* Hunter College.

¹ We are grateful to Messrs. G. R. Heil and P. R.
Jackson of the Engineering Department for expediting
the study. Mr. J. L. Alt… was responsible for
most of the computations.


Procedure 


The population was made up mostly of re- 
cent graduate engineers who had not yet been 
assigned to stable positions within the En- 
gineering Department. In the evaluation 
program each engineer was studied by means 
of a personal history form and a clinical inter- 
view as well as with a battery of ability, 
aptitude, interest, and personality tests. Dur- 
ing the course of the interview, which was 
primarily diagnostic in nature, the test findings 
were communicated to the engineer for his in- 
formation and guidance. Test evaluation was 
qualitative with every source of information 
considered in relation to all other relevant 
data. Emphasis was placed on patterned re- 
lationships and not on discrete scores or seg- 
mental behavior. Eventually a confidential 
written report, which included an analysis of 
abilities, interests, personal traits, placement 
recommendations, and developmental poten- 
tialities for each man, was submitted to the 
technical administrators who were responsible 
for assignment. 

The evaluative techniques were as follows: 
Wonderlic Personnel; Shipley-Institute of Liv- 
ing Vocabulary and Abstract Reasoning; Otis 
Arithmetic Reasoning; Bennett Mechanical 


373 


374 Ross Harrison and Theodore A. Jackson 


Comprehension (Form BB); DAT Space Re- 
lations; Strong Vocational Interest Blank; 
Thematic Apperception Test (8 picture abridg- 
ment); personal history form; personality 
questionnaire; and a projective verbal situ- 
ations test. The last two tests and the per- 
sonal history form were custom-made for our 
purposes and are not on the market. The 
Engineering and Physical Science Aptitude 
Test of Moore, Lapp, and Griffin was tried out 
on a small sample but had to be abandoned 
because it did not discriminate sufficiently for 
our highly selected population. Some of the 
other ability tests had ceilings which were 
short of ideal, but none was so low as to pre- 
clude fairly adequate discrimination for practi- 
cal purposes. 

Because many of the personality tests were 
not scoreable and because of the integrative 
nature of our procedures, orthodox correla- 
tional methods could not be employed. A 
further difficulty was that no comprehensive 
merit ratings over an extended period were 
available on the engineers. Hence, it was 
necessary to develop other, if less familiar, 
methods for testing out the validity of our 
approach.

First method. In the first phase of the 
validation the personnel evaluation reports of 
113 mechanical engineers were compared with 
subsequent job performance. The immediate 
supervisors of the engineers, who with rare 
exceptions had not had occasion to consult the 
reports, were asked to compare the personal 
descriptions and predictions therein with actual 
job behavior as they knew it from daily con- 
tact. Criteria for selecting the reports to be 
validated were twofold: (1) the psychologist’s 
recommendation must have been followed; 
and (2) the supervisor was required to have at 
least three months’ close observation of the 
man being rated. A member of the Engineer- 
ing Department decided whether the recom-
mendations had been followed or not.

The supervisor, after reading the report in 
each case, was requested to check one answer 
for the following question: 


“From your experience with this man, would
you say that you agree with the descriptions
and findings contained within the report?
High degree of agreement_____; predominant
agreement or mostly agree_____; slight agree-
ment_____; no agreement_____.”


Second method. A complementary ap- 
proach was made to validation by seeking the 
opinions of higher echelon administrators who 
were responsible for placement. These were
the directors and division managers who had 
had approximately a year and a half experi- 
ence with the evaluation program. There 
were three main questions: 


1. “In your experience with the personnel
evaluation reports, to what extent have you
found them generally helpful and accurate?
Very helpful and accurate_____; fairly helpful
and accurate_____; somewhat helpful and accu-
rate_____; not at all helpful nor accurate_____.”

2. “Compare the testing program with the
placement procedures that existed before test-
ing was initiated. Has the testing program
added anything useful or improved in any way
the placement of engineers? The testing pro-
gram is: A marked improvement in procedure
_____; a slight improvement_____; no improve-
ment_____; would prefer to dispense with it
and rely on trial and error alone_____.”

3. “Did your attitude become more or less
favorably disposed to the personnel evaluation
reports as you became better acquainted with
them and made use of them? More favorable
_____; less favorable_____; unchanged_____.”


In each instance comments were encouraged. 
In addition, there was a question about the 
approximate number of reports that each 
administrator had read. 


Results 


Recommendations. Some interest revolved 
around the question of the degree to which 
directors and division managers followed 
placement recommendations contained in the 
evaluation reports. No useful purpose would 
be served if the psychologists were academi-
cally correct but were operating in a vacuum
with their recommendations being largely 
ignored. 

In 28 per cent of the cases the findings with
regard to disposition were ambiguous. When
the ambiguities were disregarded and con-
sideration was given only to those cases where
a fairly sharp affirmative or negative decision
was possible, it was found that 78 per cent of
the recommendations had been followed. Be-
cause of personnel requirements of different
groups within the Engineering Department, it
would not be reasonable to expect that recom-
mendations could always be followed.


First method. Table 1 gives the results on 
the extent of agreement between the reports 
and the judgment of first-line supervisors 
based on their experience with the men who 
had been evaluated. Agreement was sub- 
stantial and was greater than had been antici- 
pated from a knowledge of possible attenu- 
ating factors. The criterion itself was subjec- 
tive and hence subject to human error. Even 
when follow-up investigations were carried out 
in cases of evident contradiction, it was fre- 
quently hard to determine whether the error 
lay with the psychological evaluation or with 
the supervisor’s judgment or was due to 
semantic problems of communication or to un- 
usual environmental conditions which might 
obscure the working out of the personal 
qualities reported in the evaluation. There 
can be no doubt, however, that in some in-
stances the evaluations themselves were in
error.
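
The percentages behind this statement can be recomputed from the counts given in Table 1. The sketch below does so in Python; treating the two highest categories together as "substantial" agreement is an assumption inferred from the summary's figure of over 90 per cent, and the variable names are illustrative only.

```python
# Counts of supervisors' ratings for the 113 engineers, as given in Table 1.
counts = {
    "High degree of agreement": 45,
    "Predominant agreement or mostly agree": 58,
    "Slight agreement": 9,
    "No agreement": 1,
}

total = sum(counts.values())

# Percentage of reports falling in each category.
percentages = {label: 100 * n / total for label, n in counts.items()}

# Assumed definition: "substantial" agreement = the two highest categories.
substantial = (
    percentages["High degree of agreement"]
    + percentages["Predominant agreement or mostly agree"]
)
```

Under that definition `substantial` comes to a little over 91 per cent, which is consistent with the figure of over 90 per cent quoted in the summary.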

Second method. A tabulation of the opinions 
of the directors and division managers is 
presented in Table 2. The results are favor- 
able but less so than for the more direct 
method of validation. 

When the administrators were asked to 
what degree they had found the reports 
helpful and accurate, about three-quarters 
answered in the first two (or favorable) cate- 
gories. No one said that the reports were 
valueless. 

On the second question, which compared 
the evaluation program with the placement 
procedures that existed before testing was 
initiated, practically all the respondents agreed 
that there was some improvement. They
were evenly divided as to the amount of improve-
ment, whether “marked” or “slight.”

In response to the third question, which was
concerned with attitudinal changes to the
program, 64 per cent were more favorably im-
pressed with added experience, 8 per cent be-
came more negative, while 28 per cent re-
ported no change. Of those who reported no
change, many had favorable attitudes from the
beginning.


Table 1


Agreement between Evaluation Reports and Ratings
by Supervisors of 113 Engineers


Degree of Agreement                        N    Per Cent
High degree of agreement                  45      39.8
Predominant agreement or mostly agree     58      51.3
Slight agreement                           9       7.9
No agreement                               1        .9


Table 2


Percentage Distribution of Ratings by Directors and
Division Managers on the Personnel
Evaluation Program


Question 1. (N = 24)                         Per Cent
  Very helpful and accurate                    16.5
  Fairly helpful and accurate                  58.5
  Somewhat helpful and accurate                25
  Not at all helpful and accurate               0

Question 2. (N = 21)
  A marked improvement in procedure            47.6
  A slight improvement in procedure            47.6
  No improvement                                4.8

Question 3. (N = 25)
  More favorable attitude                      64
  Less favorable                                8
  Unchanged                                    28

Information was available on the number 
of reports that each administrator had read 
for assignment purposes. An attempt was 
made to find out if reaction to the program 
was related to familiarity with reports. 
Further statistical computations, which will 
not be reported here, were made to determine 
the relationship. The resulting figures showed 
that for the first two questions those adminis- 
trators who were more familiar with evalu- 
ation reports were also more favorably dis- 
posed to psychological testing, but this same 
relationship did not hold for the third question. 

The solicited comments of the directors and 
division managers were highly diversified and 
showed only one definite trend. The ma- 
jority of the remarks were concerned with 
administrative aspects of the evaluation pro- 
gram rather than with the content of the re- 
ports themselves. Since these remarks were 
frequently critical, this fact served to attenuate 
the validity figures. Critical comments about 
the administrative use to which the reports 
were put somewhat confused the issue, since 
the basic problem was the validity of the 


376 


evaluations, not the way in which they were 
used. Moreover, many of the administrative 
problems were a consequence of the shortage 
of engineers and therefore beyond control. 


Summary 


A clinical approach to the evaluation and 
placement of engineers in the aeronautical 
industry is described. Validation was studied 
in two ways. 


1. First-line supervisors rated the accuracy
of psychological reports on the basis of their
experience with the men who had been evalu-
ated. Substantial agreement was obtained
in over 90 per cent of the cases.

2. Higher echelon administrators, respon- 
sible for assignment, after a year and a half 
experience gave their opinions on the value of 
the program. The results were again favor- 
able but were slightly less so because of ad- 
ministrative complications. 


Received August 4, 1952. 
Early publication. 


Relationship of Masculinity-Femininity to Tests of Mechanical
and Clerical Abilities *


Marilyn C. Lee


University of Illinois 


The primary purpose of this investigation 
was to study relationships between aptitude 
and personality variables with specific refer- 
ence to mechanical ability, clerical ability, and 
masculinity-femininity. A secondary purpose 
was to explore the possibility of developing a 
disguised and relatively non-fakable measure 
of M-F from clerical and mechanical test 
materials. 

The rationale for this study is based on con- 
sistent findings reported in the literature indi- 
cating that women tend to score higher than 
comparable groups of men on tests of clerical 
aptitude (5, 6, 11), while men perform better 
than women on most tests of mechanical 
ability (3, 10, 13). It should be emphasized 
that these ability differences are reported in 
reference to the biological dichotomy of males 
and females. Are such differences also re-
flected in the psychological continuum of
masculinity-femininity as measured by an 
inventory of interests, attitudes, and modes of 
behavior? 

Tests and Subjects 


The tests administered for the purposes of
this study included the Bennett Test of Me-
chanical Comprehension, Form AA, the Min-
nesota Clerical Test, and the Terman-Miles
Attitude-Interest Analysis Test, Form B.
The Bennett and Minnesota tests were
selected because they have repeatedly shown
substantial sex differences, and are of an ap-
propriate difficulty level for the sample
utilized; the Terman-Miles inventory was used
because it represents one of the most compre-
hensive and carefully constructed inventories
yet developed for the measurement of mascu-
linity-femininity. Since performance on these
tests is related in varying degrees to perform-
ance on measures of general academic ability
(1, 2, 14), available scores on the American
Council on Education (ACE) Test for College
Freshmen were also utilized in the analysis.


* This article is based on the writer’s M.A. thesis …
in December, 1950 under the … of Prof. Donald G. P… and on file in
the University of Minnesota …

The experimental group comprised 102 male 
and 72 female students recruited from a sopho- 
more level course in psychology at the Univer- 
sity of Minnesota. Biographical information 
obtained for all subjects from questions pref- 
acing the Terman-Miles test indicated that 
the sample was quite uniform with respect to 
regional background, the great majority being 
local residents. The mean ACE score of the 
group was at the 63rd percentile on national 
college freshman norms. 


Results 


Inter-Sex Differences. Means and standard 
deviations of the four test variables are pre- 
sented separately for the male, female, and 
combined groups in Table 1. These results 
corroborate previous research in showing that 
men as a group perform significantly better 
than women on the Bennett Test of Me- 
chanical Comprehension (C.R.= 10.7), while 
women are significantly superior to men on the 
Minnesota Clerical Test (C.R.=3.6). The 
sharp sex-differentiating power of the Bennett 
test is evidenced by the relatively small over- 
lap (21.7%) of the male and female score 
distributions. Although the Minnesota Cleri- 
cal Test also succeeds in dichotomizing the 
sexes, it is less efficient in this respect than the 
Bennett, as indicated by the larger score dis- 
tribution overlap of 45.9%. The superior 
differentiating ability of the mechanical test
may be at least partially attributable to its 
greater amenability to cultural influence. 
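
The critical ratios quoted above can be reproduced from the means and standard deviations in Table 1. The paper does not state the formula, so the sketch below assumes the conventional one, the difference between means divided by the standard error of that difference for independent groups; the function name `critical_ratio` is illustrative.

```python
import math

def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two group means divided by its standard error.

    Assumes independent groups and the usual standard error
    sqrt(sd_a^2 / n_a + sd_b^2 / n_b); this formula is an assumption,
    not one quoted from the paper.
    """
    se_diff = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se_diff

# Bennett Test of Mechanical Comprehension: males minus females (Table 1).
bennett_cr = critical_ratio(44.7, 8.6, 102, 30.2, 9.0, 72)

# Minnesota Clerical Test: females minus males (Table 1).
clerical_cr = critical_ratio(270.5, 46.8, 72, 244.2, 46.9, 102)
```

Rounded to one decimal place, these come to 10.7 and 3.6, in line with the critical ratios given in the text.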

By way of comparison, the Terman-Miles
Attitude-Interest Analysis Test separated
males and females with a percentage overlap
of only 9.7, an amount comparable to the 8%
average overlap obtained by the test authors
for total scores on this test (14). Results for
the ACE test bore out expectations based on
previous research (15) that there would be no
statistically significant differences between the
sex groups in performance.


Table 1


Means and Standard Deviations for the Four Test Variables


                                      Males           Females         Total Group
                                      (N = 102)       (N = 72)        (N = 174)
Test                                  M      S.D.     M      S.D.     M      S.D.
Bennett Mech. Compr. Test,
  Form AA                             44.7    8.6     30.2    9.0     38.7   11.3
Minn. Clerical Test                  244.2   46.9    270.5   46.8    255.1   48.6
Terman-Miles Att.-Int. Anal. Test,
  Form B                             +88.9   40.6    -26.9   54.1    +41.0   73.7
ACE Psychol. Exam.*                   87.2   25.0     86.1   25.8     86.8   25.3


* Scores on the ACE test were available for approximately 80% of the experimental group. N’s for this test
are as follows: males, 84; females, 58; total group, 142.

Product moment correlations of the Terman- 
Miles test with each of the three other test 
variables are presented for the total group and 
for each sex separately in Table 2. In refer- 
ring to this table it should be noted that plus 
scores on the Terman-Miles test indicate
masculinity and minus scores indicate femi- 
ninity. Thus, positive correlations between 
M-F and other test scores represent a positive 
relationship with masculinity and a negative 
relationship with femininity, the relationships 
being reversed for negative coefficients. 

Table 2

Product Moment Correlations of the Terman-Miles Attitude-Interest
Analysis Test with the Ability Test Variables

                                                              Total
Test Variables                             Males   Females    Group
Terman-Miles M-F and Bennett Mech. Compr.   .11     .24*      .57**
Terman-Miles M-F and Minn. Clerical        -.09     .05      -.22**
Terman-Miles M-F and ACE Psychol. Exam.     .08     .26*      .12

* Significant at 5% level of confidence.
** Significant at 1% level of confidence.


For the total sample, the correlation of .57 
between the Bennett Mechanical and the 
Terman-Miles tests indicates that of the test 
variables employed the Bennett is the best 
single predictor of Terman-Miles M-F. In- 
deed, for this group the Bennett test predicts 
masculinity-femininity as well as or better than it 
predicts various mechanical criteria (4, 8, 9, 
12), and correlates about as high with the 
Terman-Miles test as do many M-F tests with 
each other (7). Thus, if a mixed-sex group is 
used as the frame of reference, the Bennett 
may in a sense be referred to as a disguised 
measure of masculinity-femininity. More- 
over, the Bennett predicts Terman-Miles M-F 
as well when used alone as when supplemented 
by the Minnesota Clerical Test. The ob- 
tained multiple R of .58 indicates that little 
predictive efficiency is gained by using both 
tests. 
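
The multiple R quoted above can be checked with the standard two-predictor formula. The sketch below is illustrative only: the article does not say how R was computed, so the formula is an assumption, and the variable names are invented here. The three correlations are the total-group values reported in the text (.57, -.22, and the -.24 between the mechanical and clerical tests).

```python
import math


def multiple_r(r_y1: float, r_y2: float, r_12: float) -> float:
    """Multiple correlation of a criterion y with two predictors.

    r_y1, r_y2: correlations of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(r_squared)


# Total-group correlations taken from the text.
r_mf_bennett = 0.57         # Terman-Miles M-F with Bennett Mechanical
r_mf_clerical = -0.22       # Terman-Miles M-F with Minnesota Clerical
r_bennett_clerical = -0.24  # Bennett Mechanical with Minnesota Clerical

R = multiple_r(r_mf_bennett, r_mf_clerical, r_bennett_clerical)
print(round(R, 2))  # 0.58
```

Under these assumptions the gain over the Bennett alone (.57) is about .01, which is the point made in the text.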

The Terman-Miles M-F and Minnesota 
Clerical tests correlate -.22 with each other, 
indicating that for the sexes combined, meas- 
ured femininity is positively related to clerical 
proficiency, but that the relationship is not 
nearly as high as that between masculinity and 
mechanical ability. Since the sex overlap of 
the Minnesota score distributions is more than 
twice that of the Bennett distributions, this 
result might well be expected. One of the 
reasons for the difference in correlation magni- 
tude may be that the Bennett is more heavily 
loaded with experiential and informational 
factors than the Minnesota Clerical Test. Not 
only is the sex difference in means comparatively 
smaller for the Minnesota than for the Bennett, 
but the improvement that takes place with 
age in adolescence is also proportionately 
smaller for the Minnesota test: 9% as com- 
pared with a 23% improvement on the 
Bennett (1, 2). Both of these findings sug- 
gest that experience affects performance to a 
greater degree on the mechanical than on the 
clerical test. If the posited information load- 
ing does exist, then the Bennett test is prob- 
ably in part a measure of interest as well as of 
aptitude. It should therefore be expected to 
correlate more highly with the M-F inventory, 
which also measures interest factors. 

The obtained correlation of .12 between the 
ACE and the Terman-Miles M-F tests indi- 
cates that intelligence and masculinity-femi- 
ninity are at best negligibly correlated in a 
mixed sample. This finding is consistent with 
the general trend of research which has shown 
little or no difference between comparable 
groups of males and females on standard tests 
of intelligence (16). 

It should also be pointed out that although 
the mechanical and clerical tests both correlate 
positively with the ACE test, they correlate 
significantly negatively with each other (-.24). 
This negative correlation is probably due 
mainly to the opposite direction of sex differ- 
ences, for if general ability is partialled out of 
both tests the negative correlation becomes, as 
anticipated, somewhat greater (-.35). How- 
ever, the case of two tests which are related 
positively to general ability and negatively to 
each other is uncommon in the field of psycho- 
metrics, and suggests interesting applications 
of partial correlation and suppression tech- 
niques. 
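
The partialling step described above follows the usual first-order partial correlation formula. The sketch below is only an illustration: the correlations of each test with the ACE are not given in this section, so the two values of .30 are hypothetical placeholders and the result is not expected to reproduce the reported -.35.

```python
import math


def partial_r(r_12: float, r_13: float, r_23: float) -> float:
    """Correlation between variables 1 and 2 with variable 3 held constant."""
    return (r_12 - r_13 * r_23) / math.sqrt((1 - r_13 ** 2) * (1 - r_23 ** 2))


r_mech_cler = -0.24  # reported: mechanical with clerical
r_mech_ace = 0.30    # hypothetical: mechanical with general ability
r_cler_ace = 0.30    # hypothetical: clerical with general ability

p = partial_r(r_mech_cler, r_mech_ace, r_cler_ace)
print(round(p, 2))  # -0.36 with these placeholder values
```

Whenever both tests correlate positively with the third variable, removing it makes a negative correlation between them more negative, which is the direction of change the text reports.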

Intra-Sex Differences. The discussion thus 
far has been concerned with results obtained 
on a mixed sex sample. The inter-sex differ- 
ences obtained for both the Bennett Mechani- 
cal and the Minnesota Clerical tests are sub- 
stantial, and the Bennett test, in particular, is 
a significant predictor of Terman-Miles M-F 
scores. However, it was realized from the 
start that a more convincing demonstration of 
the predictive efficiency of ability test scores 
for the postulated trait of psychological mas- 
culinity-femininity could be obtained from 
correlations reported for each sex separately. 

While the coefficients presented in Table 2 
indicate that Terman-Miles M-F is not signifi- 
cantly associated with measured mechanical or 
clerical abilities within the male group, the ob- 
tained correlations are in the expected di- 
rections. For women, M-F is correlated to a 
low but statistically significant degree with 
both Bennett and ACE scores, but not with 
Minnesota Clerical scores. These results indi- 
cate that mechanical comprehension tends to 
be a masculine characteristic for women, and 
corroborate Terman and Miles’ finding that 
brighter college women tend as a group to rank 
slightly more masculine than their less able 
colleagues (14). 


The best prediction of Terman-Miles M-F 
for women is obtained by using both the 
Bennett and ACE tests. Although the ob- 
tained multiple R of .30 is not high enough for 
practical purposes, the data were derived from 
a sample which was very homogeneous with 
regard to age, education, and regional back- 
ground. 

In general, most of the within-sex differ- 
ences are in the same direction as the obtained 
inter-sex differences but are much reduced in 
magnitude. These reductions in correlation 
size are much greater than would be expected 
from attenuation due to restriction of the total 
sample range. Such results suggest either: 
(1) that Bennett Mechanical and Minnesota 
Clerical test scores are not particularly good 
indicators of intra-sex M-F; or (2) that the 
validity of the Terman-Miles test as a meas- 
ure of M-F within each sex should be more 
carefully investigated. 


Summary 


The following major conclusions were de- 
rived from the data obtained in this investi- 
gation: 


1. Men perform significantly better than 
women on the Bennett Test of Mechanical 
Comprehension, Form AA, and women dem- 
onstrate a somewhat less marked superiority 
on the Minnesota Clerical Test. 

2. For a mixed-sex group, the Bennett Me- 
chanical test predicts Terman-Miles M-F as 
well or better than it does most of the me- 
chanical criteria against which it has been 
validated. Moreover, its correlation with the 
Terman-Miles test compares favorably with 
reported intercorrelations of various tests 
specifically designed to measure masculinity- 
femininity. 

3. Although the Minnesota Clerical Test 
predicts Terman-Miles M-F for a mixed-sex 
group at a statistically significant level, it does 
not contribute substantially to the Bennett in 
multiple prediction of masculinity-femininity. 

4. Most of the intra-sex correlations be- 
tween ability test and M-F test scores are 
essentially in the same direction as those ob- 
tained for the total group, but are much re- 
duced in magnitude. These reductions in 
magnitude are in excess of what might be ex- 
pected due to restriction of score range within 
single-sex groups. 

5. The interpretation of the intra-sex cor- 
relations obtained in this study depends in 
part on the validity of the Terman-Miles test 
as a measure of psychological masculinity- 
femininity. It has been repeatedly demon- 
strated that the test is an extremely efficient 
discriminator of biological maleness-female- 
ness (14). In addition, Terman and Miles 
cite case histories of homosexuals who ob- 
tained scores characteristic of the opposite sex 
on the M-F scale (as well as high scores on the 
inversion scales). However, there is no con- 
clusive evidence that the test is measuring 
intra-sex M-F with a high degree of validity 
within the normal range of the population. 
In order to develop such a test it would be 
necessary to employ the additional criterion 
of ratings of psychological M-F using a suffi- 
cient number of competent judges to insure 
adequate reliability. Terman and Miles used 
only a few judges and were discouraged by the 
low reliabilities which they obtained. How- 
ever, by increasing the number of raters to 30 
or 40 through the use of sorority or fraternity 
groups, it should not be too difficult to achieve 
satisfactory reliabilities, as can be readily es- 
timated by application of the Spearman- 
Brown formula. 

6. One of the original purposes of this proj- 
ect was to explore the feasibility of developing 
a disguised measure of masculinity-femi- 
ninity from ability-type tests. As Terman and 
Miles (14) have pointed out, their M-F test is 
easily fakable if subjects are motivated to this 
end, even though the purpose of the test is 
otherwise difficult to discern. A non-fakable 
M-F test would be desirable, especially in 
selection situations, and if the test were of the 
ability type it would have the added virtue of 
providing both personality and ability scores 
for the same expenditure of time. Since it 
has a fairly high correlation with the Terman- 
Miles test within mixed-sex groups, and since 
it is so successful in differentiating the sexes, 
the Bennett Test of Mechanical Compre- 
hension could conceivably serve as the basis 
for developing such a test. It could be con- 
structed by analyzing a large pool of items 
similar to those found in the original Bennett 
and selecting those which most successfully 
differentiate men from women and at the same 
time serve as predictors of mechanical ability. 
As an additional criterion for item selection, 
trait ratings as proposed above should be used 
in order to insure high intra-sex differenti- 
ation. 
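
The Spearman-Brown estimate mentioned in conclusion 5 can be sketched as follows. The single-rater reliability of .10 is a hypothetical figure chosen for illustration; the article reports no value for it.

```python
def spearman_brown(r_single: float, k: int) -> float:
    """Reliability of the pooled ratings of k raters, given single-rater reliability."""
    return k * r_single / (1 + (k - 1) * r_single)


r_single = 0.10  # hypothetical reliability of one judge's rating

for k in (3, 30, 40):
    print(k, round(spearman_brown(r_single, k), 2))
# 3 0.25
# 30 0.77
# 40 0.82
```

Even with a very low single-rater reliability, pooling 30 to 40 raters would bring the composite to a usable level under this formula, which is the argument made in the text.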


Received January 10, 1952. 


References 


1. Andrew, Dorothy M., and Paterson, D. G. Man- 
ual for the Minnesota Clerical Test. (Rev. Ed.) 
New York: Psy. Corp., 1946. 

2. Bennett, G. K. Manual for the Test of Mechanical 
Comprehension, Form AA. New York: Psy. 
Corp., 1940. 

3. Bennett, G. K., and Cruikshank, R. M. Sex 
differences in the understanding of mechanical 
problems. J. appl. Psychol., 1942, 26, 121-127. 

4. Bennett, G. K., and Fear, R. A. Mechanical com- 
prehension and dexterity. Personnel J., 1943, 
22, 12-17. 

5. Englehardt, Olga E. The Minnesota Clerical Test: 
sex differences and norms for college groups. 
J. appl. Psychol., 1950, 34, 412-414. 

6. Goodenough, Florence L. The consistency of sex 
differences in mental traits of various ages. 
Psychol. Rev., 1927, 34, 440-462. 

7. Heston, J. C. A comparison of four masculinity- 
femininity scales. Educ. psychol. Measmt., 1948, 
8, 375-387. 

8. McDaniel, J. W., and Reynolds, W. A. A study 
of the use of mechanical aptitude tests for selec- 
tion of trainees for mechanical occupations. 
Educ. psychol. Measmt., 1944, 4, 191-197. 

9. Moore, B. V. Analysis of tests administered to 
men in engineering defense training courses. 
J. appl. Psychol., 1941, 25, 619-635. 

10. Paterson, D. G., Elliott, R. M., Anderson, L. D., 
Toops, H. A., and Heidbreder, E. Minnesota 
mechanical ability tests. Minneapolis: Univ. 
Minn. Press, 1930. 

11. Schneidler, Gertrude G., and Paterson, D. G. Sex 
differences in clerical aptitude. J. educ. Psy- 
chol., 1942, 33, 303-309. 

12. Shuman, J. T. Value of aptitude tests for factory 
workers in the aircraft engine and propeller 
industries. J. appl. Psychol., 1945, 29, 156-160. 

13. Stenquist, J. L. Measurements of mechanical abil- 
ity. New York: Teachers College Contributions 
to Education, No. 130, 1923. 

14. Terman, L. M., and Miles, Catherine C. Sex and 
personality. New York: McGraw-Hill, 1936. 

15. Thurstone, L. L., Thurstone, Thelma G., and 
Adkins, Dorothy C. The 1938 Psychological 
Examination. Educ. Rec., 1939, 20, 263-300. 

16. Tyler, Leona E. The psychology of human differ- 
ences. New York: Appleton, 1947. 


Ability Patterns in Technical Training Criteria¹ 


Benjamin Fruchter 

The University of Texas 


A number of different systems for classi- 
fying occupations into job families have been 
proposed (3, 10, 12, 13). Several investi- 
gators (1, 8) have applied factor analysis 
methods to the problem. The classifications 
have been based largely on the common ele- 
ments such as skills, intelligence, and per- 
sonality traits needed for successful perform- 
ance in the occupations. 

The present approach attempts to classify 
occupations on the basis of the profile of apti- 
tudes required of the persons who are candi- 
dates for training. It should be particularly 
applicable to vocational and educational 
guidance and in the Armed Services, where a 
period of training usually precedes assignment 
to an occupation. Occupations which are 
quite different at the level of skilled perform- 
ance may require similar profiles of aptitudes 
of applicants for training. Thus, it might be 
found that candidates for training as weather 
observers and clerk-typists require similar 
aptitudes although the jobs themselves are 
quite different above the training level. 

Both Guilford (5) and Gulliksen (7) have 
stressed the importance of analyzing criteria 
from the factorial approach so that the rela- 
tive contribution of various aptitudes and 
traits to successful performance might be 
evaluated. Scores on a battery of selection 
or classification tests can then be related to 
ultimate performance in one or more occupa- 
tions, and appropriate assignment can be 
made. 

¹ Based on a paper read at the American Psycho- 
logical Association meetings in September. Although 
this study was done while the writer was a 
civilian employee of the Air Training Command, the 
views expressed in this article are those of the writer 
and do not necessarily represent the official views of the 
United States Air Force. The writer wishes gratefully 
to acknowledge considerable computational assistance 
from Mr. William L. Grafton. 


Population and Factor Analysis 


The present exploratory study was done 
with data from the USAF Air Training Com- 
mand. A battery of classification tests is 
routinely administered to airmen, and the 
results are used to assign them to technical 
schools for training in various occupational 
specialties. To evaluate the effectiveness of 
the tests in selecting successful candidates for 
training, the scores are validated against the 
final grades made in the technical schools. 

The following two alternative approaches 
suggest themselves for determining the factor 
content of these final-grade criteria: 


1. Consider each technical specialty a sepa- 
rate population and perform a factor analysis 
of the intercorrelations of its criterion and the 
test battery, or, 

2. Consider the airmen a single population, 
perform a factor analysis of the battery on a 
representative sample. Estimate the factor 
content of each training criterion on the basis 
of the factor loadings obtained from the com- 
bined population and the validity of the tests. 


The latter approach was adopted because 
the loadings from specialty to specialty would 
be more comparable and a prohibitive number 
of factor analyses would not be required. 

A representative sample of 389 airmen who 
had been assigned to training in technical 
schools was drawn. Each school was repre- 
sented in the sample in proportion to the 
total number of airmen who had been sent 
into that type of training, and the individuals 
chosen to represent each school were randomly 
selected. 

The students were all male, mostly 18 and 
19 years old, and had on the average a tenth- 
grade education. The intercorrelations of 
the 19 tests administered to them were ob- 
tained from the Quarterly Research Reports 
of the USAF Air Training Command, 3309th 
Research and Development Squadron (11, 
p. 5). Six factors were extracted by the 
centroid method and rotated to psychologically 
meaningful positions. Table 1 gives the ro- 
tated loadings for the 19 variables. The 
orthogonal rotated factors are identified as 
follows: I. Numerical facility (N); II. Verbal 
comprehension (V); III. Socioeconomic back- 
ground (SE); IV. Perceptual speed (P); V. 
Mechanical experience (ME); and VI. Visual- 
ization (Vz). 


Table 1 

Rotated Factor Loadings for Stratified Sample of Technical School Students 

Sample: 389 airmen assigned to technical schools 
(Decimal points omitted throughout) 

                                                 Factor
Test Variables*                         I    II   III    IV     V    VI    h²
 1. Reading Vocabulary                 41    62    12    14    23    06    64
 2. Arithmetic Computations            72    15    17    04    22    05    62
 3. Arithmetic Reasoning               59    31    17    00    22    38    67
 4. Pattern Analysis                   07    26   -02    33    23    48    47
 5. Mechanical Aptitude-2              14    18    00    27    64    16    56
 6. Biographical Inventory No. 1       01   -09    56    07   -07   -05    33
 7. Memory for Landmarks               25    25    16    39    12    32    42
 8. Background for Current Affairs     22    74    01    19    16    13    67
 9. Biographical Inventory No. 2      -01    15    53    22    10    16    39
10. Arithmetic Reasoning               57    43    17   -02    26    31    70
11. Aviation Information               13    71   -05    20    36    08    70
12. Dial and Table Reading             49    32    19    38    19    25    62
13. Reading Comprehension              29    69    05    10    28    23    70
14. Electrical Information             06    53    04    18    59    19    70
15. Mechanical Principles              02    10    02    07    38    47    38
16. Numerical Operations No. 1         70    04    17    36    08   -18    69
17. Numerical Operations No. 2         78    11    07    28    08    03    71
18. General Mechanics                  01    42    08    00    63    14    60
19. Speed of Identification            08    18    08    54    26    16    43
                                        N     V    SE     P    ME    Vz

* Variables 1 through 4 are part scores of the AGCT. Variable 5 is also an AG test. Variables 6 through 19 
are the Airman Classification Test Battery. A description of these tests is contained in Dailey (2, Appendix A). 


Factor Loadings of Criteria 


The validity coefficients of the 19 tests were 
available for 14 technical-school final-grade 
criteria. They were obtained from Dailey (2). 
Table 2 lists the criteria, N for each, and the 
means and standard deviations for the groups 
on certain representative tests (a relatively 
pure test of each of the six factors isolated). 
The standard deviations were examined in 
order to obtain an estimate of the relative 
variabilities of the groups on each factor. The 
factor content of the training criteria was 
estimated by Mosier’s (9) extension of Dwyer’s 
(4) method for obtaining the factor content 
of a variable not included in the original analy- 
sis. Table 3 gives the factor loadings and 
communalities of the criteria.² 
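
The idea of estimating the loadings of a variable that was not in the original analysis can be sketched as a least-squares problem: with orthogonal factors, each validity coefficient is approximately the sum of products of the test's loadings and the criterion's loadings. The code below is a minimal illustration under that assumption and is not claimed to be the exact computing routine of the cited method. It uses only factors I and II for four tests from Table 1 (variables 2, 8, 16, 13), and the criterion loadings (.40, .30) are hypothetical values used to generate noise-free validities.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(a)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def extend_loadings(test_loadings, validities):
    """Least-squares estimate of a criterion's loadings from test validities."""
    columns = list(zip(*test_loadings))  # one tuple of loadings per factor
    ftf = [[sum(x * y for x, y in zip(ci, cj)) for cj in columns] for ci in columns]
    ftv = [sum(x * v for x, v in zip(ci, validities)) for ci in columns]
    return solve(ftf, ftv)


# Loadings on factors I (N) and II (V) for Table 1 variables 2, 8, 16, 13.
tests = [[0.72, 0.15], [0.22, 0.74], [0.70, 0.04], [0.29, 0.69]]

# Hypothetical criterion loadings, used only to build consistent validities.
true_loadings = [0.40, 0.30]
validities = [sum(f * a for f, a in zip(row, true_loadings)) for row in tests]

estimated = extend_loadings(tests, validities)
print([round(value, 2) for value in estimated])  # [0.4, 0.3]
```

Because the validities here were generated without error, the loadings are recovered exactly; with real validity coefficients the estimate would only be the best fit in the least-squares sense.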

² To reduce printing costs Tables 2 and 3 have been 
deposited with the American Documentation Institute. 
Order Document 3599 from American Documentation Insti- 
tute, 1719 N Street, N.W., Washington 6, D. C., 
remitting $1.00 for microfilm (images 1 inch high on 
standard 35 mm. motion picture film) or $1.00 for 
photocopies (6 × 8 inches) readable without optical aid. 

The airmen were sent to the technical- 
training schools on the basis of their perform- 
ance on the tests in the battery, and there is 
consequently considerable restriction of range 
in the tests, as well as on related tests for the 
selected groups. This restriction tends to 
reduce the validity coefficients and conse- 
quently the size of the factor loadings as 
compared with unselected groups. To esti- 
mate the size of the loadings for unrestricted 
samples, the values in Table 4 were calcu- 
lated. The scores on the Airman Classifica- 
tion Battery are scaled to yield a standard 
deviation of 2.0 for unrestricted samples, and 
the A. G. tests are scaled to yield a standard 
deviation of 20.0 for similar groups. Using 
the standard deviations on the tests in Table 2 
as estimates of the amount of restriction of 
each criterion group on the factors, the load- 
ings for an unrestricted sample were estimated 
by the formula for correction for restriction- 
in-range, case 1 (6, p. 349), and are shown in 
Table 4. The loading of a criterion on a 
factor was considered the correlation between 
the criterion and the factor. An alternative 
and possibly more defensible procedure would 
have been to apply the corrections directly to 
the validity correlation coefficients. 
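
The correction for restriction in range can be sketched as below. This assumes the familiar formula for direct selection on a variable whose standard deviation is known in both the restricted and the unrestricted group; whether that is exactly the textbook's "case 1" should be checked against the cited source. The loading of .30 and the restricted standard deviation of 1.5 are hypothetical; the unrestricted value of 2.0 is the scale value stated in the text.

```python
import math


def correct_for_restriction(r_restricted: float,
                            sd_restricted: float,
                            sd_unrestricted: float) -> float:
    """Estimate the correlation in an unrestricted group from a restricted one."""
    k = sd_unrestricted / sd_restricted
    return r_restricted * k / math.sqrt(1 - r_restricted ** 2 + (r_restricted * k) ** 2)


loading_restricted = 0.30  # hypothetical loading in a selected group
sd_selected = 1.5          # hypothetical SD of the marker test in that group
sd_scale = 2.0             # SD for unrestricted samples, as stated in the text

corrected = correct_for_restriction(loading_restricted, sd_selected, sd_scale)
print(round(corrected, 2))  # 0.39
```

When the two standard deviations are equal the function returns the original value, and the correction grows as the selected group becomes more homogeneous.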

It will be observed that the criteria have 
a considerable range of communalities. The 
very low communalities (e.g., for carpenter, 
radio operator ACS, and engineman-operator) 
are probably due largely to the unreliability 
of these criteria, although no direct evidence 
is available on this point.³ It would be de- 
sirable to investigate the reliability of the 
grading systems used in these schools. 

³ If the reliabilities should prove to be satisfactorily 
high then, of course, the indication would be that there 
is considerable specific variance in the criteria and that 
they are not sufficiently covered by the factors in the 
battery. 

It should be kept in mind when evaluating 
the loadings that the criteria are grades in a 
course of study rather than performance on 
the job. The two highest loadings for air- 
plane and engine mechanic (conventional) are 
on the mechanical-experience and visualization 
factors; for the jet mechanic on the mechanical- 
experience and verbal factors. The curricu- 
lum of the latter school might be examined 
to determine whether the stress on verbal 
material is necessary or even desirable. 

The aircraft sheet-metal worker grades are 
loaded highest on visualization with a smaller 
loading on mechanical experience, whereas 
draftsman grades have their highest loading 
on mechanical experience with a somewhat 
lower loading on visualization. These loadings 
might also serve as a basis for evaluating these 
two curricula, as more visualization variance 
might well be expected in drafting grades and 
more mechanical-experience variance in grades 
in a sheet-metal course. The high numerical 
loading for the clerk-typist criterion is also 
somewhat surprising. A relatively higher 
verbal loading was expected. 


Table 4 

Estimated Factor Content of Technical Specialty Training Criteria Corrected for 
Restricted Variability 

Sample: Graduates of the Technical Specialty Schools 
(Decimal points omitted throughout) 

                                             Factor
Criterion Group                      I    II   III    IV     V    VI    h²        N*
Weather Observer                    45    36    22    29    32    13    58   122-253
Radio Operator, ACS                 22    29    23    24   -02    16    27   108-114
Radio Operator, General             38    27    16    26    26    05    38   116-242
A & E Mechanic (Conventional)       31    26    04    15    74    44    93   245-384
Aircraft Sheet-metal Worker         28    35    07   -01    39    53    64   136-243
A & E Mechanic (Jet)                25    42    21    06    58    31    72   254-300
Carpenter                           17    12   -01    13    18    21    14   190-362
Control Tower Operator              31    42    17    06    20    23    40    41-110
Draftsman                           26    08    20    16    54    32    53    91-209
Electrician                         57    38    17   -03    36    09    64    74-164
Engineman Operator                  16    08    12    06    43    16    26    69-144
Clerk-Typist                        71    36    06    21   -01    17    71   168-362
Medical Corpsman                    30    57    05    15     =    15    47   426-834
Radar Mechanic, General             48    50    01    25    49    30    87    53-130
                                     N     V    SE     P    ME    Vz

* The sizes of the validation samples of the tests listed in Table 1 varied for each of the criterion groups within 
the limits shown. 
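
With orthogonal factors, a communality is the sum of the squared loadings, so the h² column of Table 4 can be checked directly. The sketch below does this for three criteria; the dictionary layout and function name are illustrative choices, and the loadings are the tabled values with decimal points restored.

```python
def communality(loadings):
    """Sum of squared loadings on orthogonal factors."""
    return sum(value ** 2 for value in loadings)


# Loadings on factors I to VI from Table 4.
table_4 = {
    "Weather Observer": [0.45, 0.36, 0.22, 0.29, 0.32, 0.13],
    "Clerk-Typist": [0.71, 0.36, 0.06, 0.21, -0.01, 0.17],
    "Carpenter": [0.17, 0.12, -0.01, 0.13, 0.18, 0.21],
}

for name, loadings in table_4.items():
    print(name, round(communality(loadings), 2))
# Weather Observer 0.58
# Clerk-Typist 0.71
# Carpenter 0.14
```

The computed values agree with the tabled communalities of 58, 71, and 14, including the very low figure for the carpenter criterion discussed above.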


Discussion and Summary 


The present study is intended to be ex- 
ploratory and point out the possibilities in 
analyzing criteria as a method for better 
understanding the variance in technical- 
training course grades and other proficiency 
criteria. The steps in an ideal setup would 
be to: 

1. Administer a comprehensive classifica- 
tion test battery to a large and representative 
sample. 

2. Randomly assign members of the sample 
to courses of training so that the group taking 
each type of training would have an unre- 
stricted range of talent. 

3. Validate each test against a criterion of 
proficiency in the course of training. The 
establishment of reliable and valid criteria is in 
itself an important step. 

4. Intercorrelate and factor-analyze the 
test battery. 

5. Estimate the factor loadings of the 
criteria. 

If desired, further steps might be to esti- 
mate the correlations among the criteria⁴ and 
analyze the resulting correlations into clusters 
to form families of jobs requiring similar kinds 
and amounts of abilities. 

⁴ By means of the formula for reproducing the corre- 
lation between two variables from their orthogonal 
factor loadings, $r_{12} = a_1 a_2 + b_1 b_2 + \cdots + m_1 m_2$. Usu- 
ally it is not possible to obtain the correlations between 
criteria directly, since a given group is rarely subjected 
to more than one course of training. 

With a comprehensive test battery and 
validities for large numbers of curricular and 
occupational criteria, it should be possible to 
put vocational guidance and occupational 
classification on a more objective basis. 
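
The footnoted formula for reproducing a correlation from orthogonal factor loadings can be illustrated with two rows of Table 4. This is a sketch only: the article does not report any estimated correlations between criteria, so the resulting figure is a by-product of the tabled loadings rather than a published result, and the function name is invented here.

```python
def reproduced_correlation(loadings_1, loadings_2):
    """Sum of products of corresponding loadings on orthogonal factors."""
    return sum(a * b for a, b in zip(loadings_1, loadings_2))


# Loadings on factors I to VI from Table 4.
weather_observer = [0.45, 0.36, 0.22, 0.29, 0.32, 0.13]
clerk_typist = [0.71, 0.36, 0.06, 0.21, -0.01, 0.17]

r_estimated = reproduced_correlation(weather_observer, clerk_typist)
print(round(r_estimated, 2))  # 0.54
```

Most of the estimated overlap between these two criteria comes from the numerical and verbal factors, which is the kind of shared aptitude profile the introduction anticipates for weather observers and clerk-typists.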


Received November 29, 1951. 


References 


1. Coombs, C. H., and Satter, G. A. A factorial 
approach to job families. Psychometrika, 1949, 
14, 33-42. 

2. Dailey, J. T. Development of the Airman Classi- 
fication Test Battery. Air Training Command 
Research Bulletin 48-4, November, 1948. 

3. Dvorak, B. J. Differential occupational ability pat- 
terns. Minneapolis: Univ. of Minnesota Press, 
1935. 

4. Dwyer, P. S. The determination of the factor 
loadings of a given test from the known factor 
loadings of the other tests. Psychometrika, 1937, 
2, 173-178. 

5. Guilford, J. P. Factor analysis in a test-develop- 
ment program. Psychol. Rev., 1948, 55, 79-94. 

6. Guilford, J. P. Fundamental statistics in psychology 
and education. (Second edition.) New York: 
McGraw-Hill, 1950. 

7. Gulliksen, H. Intrinsic validity. Amer. Psycholo- 
gist, 1950, 5, 511-517. 

8. Lawshe, C. H., Dudek, E. E., and Wilson, R. F. 
Studies of evaluation. VII. A factor analysis of 
two point rating methods of job evaluation. J. 
appl. Psychol., 1948, 32, 118-129. 

9. Mosier, C. I. A note on Dwyer: The determination 
of the factor loadings of a given test. Psycho- 
metrika, 1938, 3, 297-299. 

10. Shartle, C. L. Occupational information. New 
York: Prentice-Hall, 1946. 

11. 3309th Research and Development Squadron. Re- 
search report No. 10. San Antonio, Texas, Lack- 
land Air Force Base, January 1949. 

12. Viteles, M. S. Industrial psychology. New York: 
W. W. Norton, 1932. 

13. Zerga, J. E. Job analysis, a resumé and bibliog- 
raphy. J. appl. Psychol., 1943, 27, 249-267. 


A Psychological Study of Occupational Adjustment * 


Alastair Heron 


Medical Research Council, London, England 



This paper reports a study of male unskilled 
factory workers. The aim may be stated as 
follows: by means mainly of objective psycho- 
logical tests to study the relationships which 
may thereby be shown to exist between various 
aspects of the personalities of a group of un- 
skilled male factory workers, and the extent 
to which they appear to be meeting the 
demands of the job situation. 


Population Studied 


The population consisted of an intact section 
of a basic production department in a medium- 
sized factory. The 80 men concerned, vary- 
ing in age from 22 to 64 years, were unskilled 
operatives on individual piecework who had 
been in that department for periods ranging 
from 1 to 30 years, the median falling at 10 
years. As raw material accounted for nearly 
90 per cent of the cost of the product, the firm 
offered attractive wage and bonus rates. The 
job is one which over the years would select 
men who were in the main physically fit steady 
workers unlikely to be exceptional in many 
respects. It is the kind of job in which 
Russell Fraser (2) found the lowest incidence 
of neurosis. 

Study of the task—pouring molten lead into 
hand-operated moulds—suggests that only 
minimal intellectual equipment would be re- 
quired, and that some obsessionals might 
adapt well. Beyond that it was not 
meaningful to hypothesize in terms related to 
the demands of the job, except insofar as it 
is generally believed that emotional insta- 
bility is reflected in poor job adjustment. 


Personality Variables 


A battery of 22 individually administered 
objective tests was used, designed to cover 
such aspects of personality as general mental 
ability, emotional stability, temperament, and 
dexterity. Full details of the tests, the re- 
sults obtained and the methods of statistical 
analysis will be found elsewhere (3). For 
the present purpose it will suffice to say that, 
with age partialled out, the matrix of product 
moment correlations between the tests was 
factor-analyzed, using Burt's simple sum- 
mation method. The analysis yielded four 
significant factors accounting for 30 per cent 
of the variance. After five orthogonal rota- 
tions, three of these factors were readily 
identified in the light of previous data about 
the tests concerned. 

* The author is a member of the Unit for Research 
in Occupational Adaptation, which is based in the 
Maudsley Hospital, Denmark Hill, London, S.E. 5. 
This paper was presented in condensed form to the 10th 
International Congress of Psychotechnics at Gothen- 
burg, Sweden, in July 1951. It is a revision of part 
of a thesis accepted by the University of London. 

Factor scores were obtained by simple ad- 
dition of the individual’s normalized per- 
centile scores on the four tests having the 
highest loadings on the factor in question, in 
the final rotated solution. In this way no 
weight has been given to the actual loadings 
themselves. The factor analysis has in effect 
been used to select small groups of tests which 
have been shown to be related to one another 
in a psychologically meaningful way. Its 
arithmetic is then abandoned, and relation- 
ships between the “factor scores” and the 
criteria are expressed as zero-order coeffi- 
cients. 
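
The unit-weighted "factor score" described above can be sketched as follows. Two points are assumptions made for illustration: "normalized percentile scores" is read here as percentiles converted to standard normal deviates, and the percentile values for the one worker shown are hypothetical. The four test names are those listed for Factor IV in the Results.

```python
from statistics import NormalDist


def normalized(percentile: float) -> float:
    """Convert a percentile (0 to 100, exclusive) to a standard normal deviate."""
    return NormalDist().inv_cdf(percentile / 100)


def factor_score(percentiles: dict, tests: list) -> float:
    """Unweighted sum of normalized scores on the tests marking a factor."""
    return sum(normalized(percentiles[test]) for test in tests)


# The four tests with the highest loadings on Factor IV, as named in the Results.
speed_tests = [
    "Finger Dexterity",
    "Quick Approach to Time Test",
    "Manual Dexterity",
    "Speed on Track Tracer",
]

# Hypothetical percentile ranks for one worker.
worker = {
    "Finger Dexterity": 50,
    "Quick Approach to Time Test": 84,
    "Manual Dexterity": 16,
    "Speed on Track Tracer": 50,
}

score = factor_score(worker, speed_tests)
print(round(score, 2) + 0.0)  # 0.0: the high and low scores cancel
```

No use is made of the factor loadings themselves; the analysis serves only to decide which four tests are added together, as the text states.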


Criteria of Occupational Adjustment 


Details of the establishment of these criteria 
will be found elsewhere (4). In view of their 
decisive importance, however, a description of 
each is considered necessary here. 

Productivity. This was an index based on 
the individual’s average productivity over a 
period of 67 weeks, as obtained from shop-floor 
records. It appeared to be so closely related 
to actual output (checked for 45 men over 26 
weeks on a particular type of casting) as to 
justify considerable confidence in its use. The 
principal criticism to which it should be sub- 
jected is that it may suffer from artificial 
limitation of range due to external variables. 
It is free from the familiar criticism in terms of 
the unknown effect upon production which 
can result from direct observation of the 
worker or the installation of special counting 
devices. As these men were working on an 
individual piecework basis, such an index 
seems to provide a measure of occupational 
effectiveness in terms of the individual’s 
“ability” to earn his living relative to that of 
others doing the same work. This does not, 
however, take into account the incentive 
differences arising from social, familial and 
economic pressures. Two men might easily 
possess the same hypothetical ability to 
carry out the working-operation at a high 
average rate, but one might show an actual 
rate much lower than the other because, for 
example, his wife was earning and he had no 
children to support. All but one of the men 
in the present study were married and a con- 
siderable proportion had children under school- 
leaving age. The latter variable was found 
to correlate with Productivity +.226 (P=.05), 
thus illustrating the point under discussion. 
It seems obvious that no collection of psycho- 
logical assessments could hope to account 
among them for more than a small propor- 
tion of the variance of this criterion. 

Job Adjustment. For this the author ob- 
tained a rating on a normally distributed 5- 
point scale, combined from the independent 
ratings of six supervisors on each of two oc- 
casions five weeks apart. The average con- 
sistency coefficient of the raters was .78, and 
careful statistical analysis established the 
unidimensionality of the criterion. It may be 
defined as a measure of the extent to which a man 
is a source of concern to his supervisors. 

This criterion differs from a “merit rating” 
as generally used. The latter is most usually 
a means whereby men are considered for pro- 
motion in status, responsibility or earnings, 
and the rater’s judgments are deliberately 
oriented in such a way as to maximize the 
predictive power of the rating for these pur- 
poses. In the present situation, no such 
merit rating system was in existence; the 
procedure used was to obtain an assessment 
of the extent to which each man was a source 
of concern to the supervisor. Further, the 
role of the investigator as a member of a re- 
search unit was already well established, and 
his independence of management recognized 
to a very considerable extent. The influence 
of “halo” resulting from familiarity was not, 
however, wholly absent; this could not readily 
be avoided when studying a well-established 
group whose experience in that department 
ranged from 1 to 30 years. In general, the 
criterion may perhaps be regarded with some 
justification as being satisfactory for the 
purpose. 


Results 


In view of the significant correlation with 
age of various tests in each factor score, its 
effects have been held constant by partial cor- 
relation when computing the relationship of 
each factor score with the Productivity 
criterion. When the Job Adjustment criterion 
is involved, both age and experience in the shop 
are held constant, as the latter variable cor- 
relates +.4 with this criterion. 
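The partialling described above can be sketched with the standard formulas for first- and second-order partial correlation. This is only an illustration: the function names are hypothetical, the input values are made up, and the paper does not report the zero-order coefficients that would be needed to reproduce its own figures.

```python
from math import sqrt


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def partial_r2(r_xy, r_xz, r_yz, r_xw, r_yw, r_zw):
    """Second-order partial correlation of x and y with z and w held constant.

    Built recursively: first remove z from every pair, then remove w.
    """
    r_xy_z = partial_r(r_xy, r_xz, r_yz)
    r_xw_z = partial_r(r_xw, r_xz, r_zw)
    r_yw_z = partial_r(r_yw, r_yz, r_zw)
    return partial_r(r_xy_z, r_xw_z, r_yw_z)


# Hypothetical values (not from the paper): a factor score x, a criterion y,
# and age z, with r_xy = .50, r_xz = .30, r_yz = .40.
factor_vs_criterion_age_constant = partial_r(0.50, 0.30, 0.40)
```

For the Job Adjustment criterion, where both age and experience are held constant, the second-order function would apply, with age as `z` and experience in the shop as `w`.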

For the sake of clarity the results are pre- 
sented in summary form, and only those co- 
efficients are shown which are significant at 
the .05 level or better. Short titles of tests 
are provided in order to facilitate reference to 
details available elsewhere (3). 

1. Factor I. “General Mental Ability” 
(Dominoes Non-verbal + Letter Series + 
Paper Formboard + Vocabulary) showed no 
significant relationship with either criterion. 

2. Factor II. This factor is not named, but 
is suggestive of “Hysteric Tendency” or 
“Neurotic Extraversion” (low Hand Persist- 
ence + low Leg Persistence + many Food 
Aversions + high “neurotic” score on Word 
Connection List) and also showed no signifi- 
cant relationship with either criterion. 

3. Factor III. “Emotional Instability” 
(many Worries + much Static Ataxia + many 
Annoyances + many Interests) correlated with 
Poor Job Adjustment +.45 (P = .001).

4. Factor IV. “Speed of Approach” 
(Finger Dexterity + Quick Approach to Time 
Test + Manual Dexterity + Speed on Track 
Tracer) showed the following relationships: 
(a) with Productivity, +.25 (P = .05); and 
(b) with Good Job Adjustment, +.28 (P = 
.01).

5. An unweighted combination of Factor III
(reversed) and Factor IV, using the “pooling
square” technique, gave a correlation with
Good Job Adjustment of +.53.
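As a rough check on item 5, the correlation of a criterion with an unweighted sum of two standardized scores can be computed from the three pairwise correlations. The sketch below uses that general formula for the correlation of sums. The function name is hypothetical, and the correlation between the two factor scores is not reported here, so the value of zero used for it is purely an assumption.

```python
from math import sqrt


def composite_r(r_ca, r_cb, r_ab):
    """Correlation of criterion c with the unweighted sum of standardized a and b."""
    return (r_ca + r_cb) / sqrt(2 + 2 * r_ab)


# Factor III (reversed) correlates +.45 and Factor IV +.28 with Good Job
# Adjustment. ASSUMPTION: the two factor scores are uncorrelated (r_ab = 0).
r_if_uncorrelated = composite_r(0.45, 0.28, 0.0)
```

Under that assumption the composite correlation comes out at about .52, close to the reported +.53. A slightly negative correlation between the two scores would raise it further.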




Discussion 


It may be concluded formally that the null 
hypothesis was confirmed in respect of general
mental ability, as assessed by non-verbal, 
verbal-educational and visuo-spatial tests. 
In view of the low level of intellectual demand 
implicit in the actual job, and the number of 
men employed on it who were shown to be 
well above average in general mental ability, 
this finding is of some importance. If it were 
true in this instance, as suggested by Wyatt 
and Langdon (6), that such men would be 
handicapped by their excess of unneeded intel-
lectual potential, this should have been re-
flected in a small negative relationship with
one or more criteria of adjustment; no such re-
lationships were found. It may be, of course,
that these men exhibit adaptation to other as- 
pects of the total situation which favor them 
in a way which they had not encountered on 
jobs apparently more in line with their in- 
tellectual potentialities. Conversely, it also 
seems of some importance that men of ex- 
tremely low mental ability—equivalent in 
several cases to that of high-grade mental 
defectives—appear able to do this job satis- 
factorily. This supports the conclusion reached 
by Tizard and O’Connor (5), when considering 
the employability of high-grade mental de- 
fectives, that monotonous work of a rela- 
tively simple kind may suit them well. 

If the hypothesis be advanced that “emo-
tional instability” as evidenced by objective
tests is related systematically to “poor job 
adjustment” as rated by supervisors, it is sus- 
tained by the findings. It is not sustained in 
respect of productivity. The four tests having 
the highest loading on this factor are all as- 
sociated with what has been described as the 
“dysthymic” rather than the “hysteric” end of 
the temperament continuum in neurotics (1). 
This includes the anxiety states, obsessional 
tendency and reactive depression. It seems 
that in the sample studied it is this heterogene- 
ous group of relatively unstable men who tend 
to be a source of concern to their supervisors. 
Their handicap is not, however, sufficient to 
affect their productive capacity over a long 
period. 

No hypothesis is required to provide a basis 
for the unsurprising finding that a tendency to
“Speed of Approach to a Task,” as measured
by the Factor IV score, is related to Produc-
tivity. If, as has been suggested, this may be
regarded as a characteristic mode of person- 
ality, then for a man engaged on individual 
piecework its deficiency appears to constitute 
a handicap. At this point we may recollect 
the attenuating effects on such a relationship 
of the defects of the criterion; it may well be 
that a coefficient of .25 represents in this in- 
stance a more serious handicap than would ap- 
pear from its size. 

The combination of a high tendency to 
“emotional instability” of an anxious or de- 
pressive type, and a slow characteristic ap- 
proach to a task, is found to be related to 
supervisory assessment of job adjustment at a 
level which, without the aid of weights de- 
rived from multiple correlation, is high 
enough (.53) to be regarded as evidencing a 
genuine handicap in the situation studied. 


Summary 


1. This paper reports the relationships 
found between personality variables and 
occupational adjustment in a sample of 80 
male unskilled factory workers.

2. Personality variables were derived from a 
factor analysis of the intercorrelations between 
22 individually-administered objective tests. 

3. Two occupational criteria were specially 
prepared for the investigation; one was a meas- 
ure of productivity, the other of the extent to 
which men were a source of concern to their 
supervisors. 

4. Some significant relationships were found 
and are discussed briefly from various points 
of view. 


Received January 18, 1952.


References 


1. Eysenck, H. J. Dimensions of personality. London: 
Kegan Paul, 1947. 

2. Fraser, R. The incidence of neurosis among factory
workers. Rep. No. 90, Ind. Hlth. Res. Bd.
London: H. M. Stationery Office, 1947.

3. Heron, A. The objective assessment of personality 
among factory workers. In press. 

4. Heron, A. The establishment for research purposes 
of two criteria of occupational adjustment. 
Occup. Psychol., 1952, 26, 2, 78-85. 

5. Tizard, J., and O’Connor, N. The employability of 
high-grade mental defectives, II. Amer. J. 
ment. Def., 1950, 55, 144-157. 

6. Wyatt, S., and Langdon, J. N. Fatigue and boredom
in repetitive work. Rep. No. 77, Ind. Hlth. Res.
Bd. London: H. M. Stationery Office, 1937.


How Supervise? Scores Before and After Courses in Psychology 


Frederic R. Wickert 
Michigan State College 


How Supervise? is a device developed by 
File and Remmers for the measurement of 
attitudes and understandings necessary for 
supervisory success. Karn (2), in summa- 
rizing the rather inconclusive evidence for the 
validity of this questionnaire, states, “. . . 
Further studies appear to be in order before 
the question concerning the universal validity 
of the test can be settled.” 

Karn then reported an investigation of his 
own which he felt had a bearing on the validity 
of How Supervise?. He administered Forms 
A plus B of this test to 108 college students in 
their junior year, before and after a course in 
psychology. He also administered the two 
forms of this test to a comparable control 
group of 104 students before and after a course 
in English literature. Results showed that 
the two groups achieved the same mean 
scores on the test at the start of the psychology 
and English courses, respectively. When the 
test was readministered at the end of each 
course, however, the psychology (experimental) 
group had gained significantly, but the English 
(control) group had not. Karn further re- 
ports that Form B alone (as well as Forms A 
plus B) was sufficiently sensitive to show a 
significant difference for the psychology (ex- 
perimental) group on initial and final testings. 

The present study was designed, from one 
point of view, to supplement the Karn study 
by providing further evidence on the validity 
of How Supervise? From another point of 
view, if one assumes that How Supervise? 
measures the attitudes and understandings 
that enter into good human relations, the 
present study was concerned with discovering 
the degree to which such attitudes and under- 
standings had been taught in various types of 
psychology courses at the college level. From 
the first point of view, then, the present study 
was a test validation project; from the second 
point of view it was an educational research 
project. The reader may take his choice. 


Procedure 


Four groups of students at Michigan State 
College served as subjects.¹ Group 1 was
made up of 68 students in the beginning course
in general psychology. The majority of
these students were sophomores with a
sprinkling of freshmen as well as a few juniors 
and seniors. They came from many depart- 
ments of the College. The content of this 
one-quarter course was that traditionally 
covered in such courses: perception, emotion, 
learning, motivation, etc. No attempt was 
made to discuss psychological factors in human 
relations problems. 

The 77 students in Group 2 were also 
primarily sophomores. No freshmen were in-
cluded, but there was a somewhat larger pro-
portion of juniors and seniors than in the
general psychology course. This second 
course was the beginning course in industrial 
psychology, and, as might have been expected, 
about half the students came from the School 
of Business and Public Service and the other 
half from the School of Science and Arts. The 
course in general psychology was a prerequisite 
for this course, so it represents the second
stage in a hierarchy of courses. About one-
fifth of the course was directly devoted to
teaching the psychology of human relations 
in industry; the remainder of the course was 
concerned with such traditional topics in 
industrial psychology as selection testing, 
merit rating, training, job analysis and job 
evaluation. 

The 51 students in Group 3 were about 
evenly divided between juniors and seniors.
They were taking a course completely devoted
to the study of psychological factors in human
relations in industry. The prerequisite for
this course was the beginning course in indus-
trial psychology. Hence it was the third
course in the hierarchy.


1 Group 1 was 41% women, Group 2, 17%, Group 3,
15%, and Group 4, 29%.




For Group 4, 31 students taking either or 
both of two advanced courses in industrial 
psychology were used. These students were 
all seniors. These two courses, which had
the course in the psychology of human rela- 
tions in industry as one of their prerequisites, 
were concerned with: (1) how to do various 
kinds of interviewing in industry and govern- 
ment situations; and (2) training and super- 
vising in industry and government. The 
courses depended on and referred to psycho- 
logical factors in human relations but did not 
specifically emphasize them or elaborate upon 
them. 

These four groups, then, represented each of 
four stages of a hierarchy of successively more 
advanced courses in psychology, the first of 
which had little or no human relations train- 
ing content, the second with some, the third 
almost completely, and the fourth relatively 
little except as an assumed foundation. The 
problem of this study was to work out the re- 
lationship between training in human rela- 
tions and performance on How Supervise?. 
Fortunately, Karn, in the study already 
referred to, had demonstrated that a control 
group, under circumstances roughly compa- 
rable to those of the present study, showed no 
appreciable gain between initial and terminal 
testing on How Supervise?. Accordingly, no 
series of control groups matched with each of 
the four experimental groups were utilized in 
the present study. It was assumed, perhaps 
not with complete justification, that such 
control groups would have shown no gains 
just as Karn’s control group did not. 

The details of administration of the test are 
as follows. In each of the four groups during
the first week of the same college quarter, 
Form A was given to one half the group and 
Form B to the other half. During the last 
week of the quarter, Form B was given to 
those who were initially tested on Form A, 
and Form A was given to those initially tested 
on Form B. This reversing of forms was 
done since File said, “Forms A and B are 
closely equivalent.” He also added, “ 
Norms of ‘One Form Only’ may be applied to 
either Form A or Form B.” Furthermore 
he suggested using the different forms if “re- 
peated testing is anticipated” (1). 


Results


Since Form A was given first to half of each 
group and was followed by Form B, and the 
opposite procedure was followed for the other 
half of each group, it was possible to check 
the comparability of the two forms in detect- 
ing the effects of training. Table 1 shows this 
comparison. It will be seen that the means 
of both Forms A and B, when given first, were 
practically identical. It will also be seen 
that when Form A was followed by Form B, 
a significant gain occurred. However, when 
Form B was followed by Form A, some gain, 
but not a statistically significant one, re- 
sulted. This finding suggested that the two 
forms, though identical on first administration, 
were not equally sensitive to picking up the 
effects of the kind of human relations training 
involved in this study. Notwithstanding this 
finding, it was decided to include in the analysis 
of results all the cases, that is, to include those 
who took Form B first followed by Form A as 
well as those who took Form A first followed 
by Form B. 

The principal results are shown in Table 2. 
This table compares initial and terminal scores 
on How Supervise? for each of four successively 
more highly trained groups. The table indi-
cates that there was some gain in all four
groups. Group 3, which received the most


Table 1


Comparison between Forms A and B of How Supervise?
When Used in Initial and Terminal Testing,
All Four Groups Combined


                              Form A Given        Form B Given
                              First and           First and
                              Form B Second       Form A Second

N                             114                 113
Mean of form given first      48.07 (Form A)      48.11 (Form B)
S.D.                          9.62                8.88
Mean of form given second     53.00 (Form B)      49.16 (Form A)
S.D.                          7.58                7.90
Difference between means      4.93 (Gain)         1.05 (Gain)
“t”                           4.98*               1.01


* Significant at 1% level. (A “t” of 2.6 is needed
for significance at 1% level and 1.99 at 5% level.)




Table 2


Comparison of Initial and Terminal Scores on How Supervise? for Four Successively
More Highly Trained Groups


                                                                       Before vs.      After vs.
                                                                       After           Next Before
   Group                                N           Mean     S.D.      “t”             “t”

1. General Psychology                   68  Before  44.74    9.58
2.                                          After   46.26    7.65      1.01 (gain)
                                                                                       .03 (loss)
3. Sophomore Industrial Psychology      77  Before  46.22    9.10
4.                                          After   49.48    7.30      2.44* (gain)
                                                                                       2.31* (gain)
5. Psychology of Human Relations        51  Before  52.39    6.66
6.    in Industry                           After   56.67    5.21      3.58** (gain)
                                                                                       2.32* (loss)
7. Senior Industrial Psychology         31  Before  53.00    7.74
8.                                          After   56.13    5.44      1.81 (gain)


* Significant at 5% level.
** Significant at 1% level.


direct training in human relations principles, 
showed the greatest gain (significant at the 
1% level). Group 2, which received some 
human relations training, showed the next 
greatest gain (significant at the 5% level). 
Group 4, which received additional human 
relations training indirectly only, showed con- 
siderable gain but the N was too small to show 
statistical significance. Group 1, the training 
for which included only occasional and then 
only superficial references to human relations 
principles, showed the least gain (not statis- 
tically significant). 
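The before-versus-after “t” values in Table 2 can be approximated from the summary statistics alone. The sketch below is a reconstruction and not the author's stated method. It assumes the two testings were treated as uncorrelated and that each variance was divided by N − 1, a choice made only because it reproduces the printed figures. The function name is hypothetical, and the means and S.D.s are as read from the table.

```python
from math import sqrt


def t_from_summary(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """t for a difference between two means, treated as uncorrelated.

    ASSUMPTION: each variance is divided by N - 1; the paper does not say
    which formula was used.
    """
    standard_error = sqrt(sd_1 ** 2 / (n_1 - 1) + sd_2 ** 2 / (n_2 - 1))
    return (mean_2 - mean_1) / standard_error


# (N, mean before, S.D. before, mean after, S.D. after), as read from Table 2.
groups = {
    1: (68, 44.74, 9.58, 46.26, 7.65),
    2: (77, 46.22, 9.10, 49.48, 7.30),
    3: (51, 52.39, 6.66, 56.67, 5.21),
    4: (31, 53.00, 7.74, 56.13, 5.44),
}

t_values = {
    group: round(t_from_summary(m_before, s_before, n, m_after, s_after, n), 2)
    for group, (n, m_before, s_before, m_after, s_after) in groups.items()
}
```

The same function, given the “after” figures of one group and the “before” figures of the next with their separate Ns, gives values close to the “After vs. Next Before” column.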

The results in Table 2 also suggest the 
operation of certain factors other than train- 
ing in human relations. At the time this 
project was carried out, there were, during 
each of the 3 quarters of the regular academic 
year, about 20-22 sections of general psy- 
chology of about 80 students each, 5-6 sec- 
tions of sophomore industrial psychology of 


2 An explanation of the appreciable gain in the test 
scores of Karn’s general psychology group before and 
after training and the negligible gain of the general 
psychology group at Michigan State College deserves 
to be mentioned. Karn states regarding the Carnegie 
Institute of Technology course, “the emphasis . . . was
. . . upon the application of psychological principles to
adjustment problems, particularly those likely to be 
encountered in human relations situations in industry,” 
while the content of the course at Michigan State 
College, as was mentioned earlier in this paper, was 
more orthodox. 


about 80 students each, and 2 sections of the 
psychology of human relations in industry of 
from 25 to 30 students each.³ Two of the
three quarters, there were generally 2 different
classes in senior industrial psychology of 20
students each. This means that only perhaps
a fourth of the students in general psychology
went on to sophomore industrial psychology.
For many students, these two courses were
required. Group 3, however, was more highly
self-selected; the course in the psychology of
human relations was not required so that
it was usually limited to highly interested
students. Generally, students took the
courses represented by Groups 1, 2, and 3
closely following one another in time. There
was usually a considerable lapse of time, often
3 to 5 quarters, before taking senior indus-
trial psychology.

With the above administrative factors in
mind, it is not difficult to interpret certain of
the results in Table 2. Students in sophomore
industrial psychology entered that course at
just about the level of training in human re-
lations they had attained in their general psy-
chology course. They did not appear to be


3 It should be pointed out that the one section of
general psychology and one section of sophomore indus-
trial psychology were assumed to be representative of
the total group. Groups 3 and 4, however, included
all the students in the courses covered in this study.




very much different in their knowledge of 
human relations principles from the mass of 
students finishing the one beginning course in 
general psychology. The relatively small 
group of students who went on to the psy- 
chology of human relations in industry, how- 
ever, started significantly above the mean (as 
shown by the entry in the column headed 
“after vs. next before ‘t’ ”). The fact that 
the still smaller group (Group 4) in the senior 
industrial psychology courses started and 
ended at about the same level as Group 3 is 
not easy to interpret. It is possible that 
generally the poorer students from the psy- 
chology of human relations classes went on 
into the senior industrial psychology. It is 
also possible that these seniors might have 
forgotten some information in the 3 to 5- 
quarter time gap between their having taken 
the psychology of human relations course and 
the senior industrial psychology course. A 
further possibility is that these students were 
scoring so close to the upper possible scores 
(about the 95th percentile on File’s “lower
level supervisor” norms and about the 70th 
percentile on the “higher level supervisor” 
norms) that it was difficult for a group aver- 
age to stay consistently high. 


Discussion 


The results of this study showed first that 
the two forms of the File-Remmers question- 
naire might well be equivalent before a specific 
course of training in human relations principles 
but were not equivalent in their sensitivity to 
changes brought about by such a course of
training. One form (Form B) proved to be
quite sensitive but the other form (Form A)
was not. Some further analyses of the kinds
of items in the two forms, of the content of
the training courses, and of the data in this 
project (not reported here) failed to disclose 
any very good reasons for this finding. 

Notwithstanding the differential sensitivity 
of the two forms, the File-Remmers test was 
found to have considerable validity in the 
sense that gains in the scores before and after 
a course of training corresponded closely to 
the known content of the training courses. 
The greater the human relations content, the 
greater was the gain. 


If, on the other hand, the File-Remmers test
is assumed to have validity, it could then be 
stated that the training was having a signif- 
icant effect in inculcating the attitudes and 
understandings underlying good human re- 
lations. 

This study, then, supplements that of Karn. 
In general, it shows successive gains in knowl- 
edge of human relations principles through a 
series of successively more advanced courses in 
psychology. It also, to use Karn’s words, is 
“a contribution towards the establishment of 
the universal validity of the test since it 
deals . . . with college students under aca- 
demic instruction. Previous studies, which 
have shown positive gains with training, have 
been concerned with the effects of specific 
industrial training programs among on-the-job 
supervisory personnel. The demonstration 
of improved scores under a variety of training 
conditions indicates that the responses are a 
result of the application of basic principles to 
the problems rather than the acquisition of
specific answers to the questionnaire items.” 


Summary 


Four groups of college students, each group 
at a successively more advanced level of train- 
ing with respect to knowledge of human rela- 
tions principles, were given the File-Remmers 
test, How Supervise?, before and after each of 
four successively more advanced courses in 
psychology. Control groups were not used 
since an earlier study by Karn, under roughly 
comparable circumstances, had already shown 
that while an experimental group of college 
students gained in knowledge of human re- 
lations principles, a control group did not gain 
appreciably. It was then assumed, perhaps 
not with complete justification, that if control 
groups matched for each experimental group 
in the present study had been utilized, they 
would not have shown the gains made by the 
experimental groups. 

Results showed that the two forms of the 
test were closely equivalent before training, 
but Form A following Form B was consider- 
ably less sensitive in detecting the effects of 
human relations training than Form B follow- 
ing Form A. 




Results also showed that gains in mean 
scores for the four groups corresponded closely 
to the amount of human relations training be- 
lieved to have been included in each of the four 
courses. Moreover, in general the mean score 
level of each successively more advanced group 
at the beginning of training corresponded 
closely to the level of human relations knowl- 
edge that each group could have been expected 
to have by virtue of the degree of human re- 
lations training it had received in the past. 
These results may be interpreted in either one, 
but not both, of two ways. If it is accepted 
that the courses in psychology were inculcating 
the attitudes and understandings that enter 
into good human relations, then the results 
may be taken to show some validity for How 
Supervise? as a test of such attitudes and 


understandings. On the other hand, if How 
Supervise? is accepted as a test of the atti- 
tudes and understandings that enter into 
good human relations, then the results may be 
taken to show that the psychology courses, 
particularly the ones designed to do so, were 
in fact successfully inculcating such attitudes 
and understandings. 


Received January 8, 1952. 


References 


1. File, Q. W., and Remmers, H. H. How Supervise?
(revised manual). New York: Psychological
Corporation, 1948. Pp. 8.

2. Karn, H. W. Performance on the File-Remmers 
Test, How Supervise?, before and after a course 
in psychology. J. appl. Psychol., 1949, 33, 534- 
539. 




Measurement of Supervisory Ability 


Gerald C. Carter 


University of Illinois


A battery of tests which consisted of The 
Personal Audit; How Supervise? Form A and 
B; Mechanical Comprehension Test Form AA; 
Kuder Preference Record; and the Otis Quick 
Scoring Mental Ability Test was given to the 
foremen and assistant foremen in two metal 
fabricating plants. Ratings by fellow super- 
visors were used as criteria of supervisory 
ability. Group A was composed of 15 foremen 
in the larger of the two plants. Group B was 
composed of 16 assistant foremen in the same 
plant and Group C was composed of 17 fore- 
men in the smaller of the two plants. Some 
of the supervisors did not take all of the tests. 
Therefore, the N varied somewhat from test to
test.


Criterion


The number of fellow supervisors who did 
the rating in the larger plant was 27. The 
number of fellow supervisors who did the 
rating in the smaller plant was 15. 

The ratings were made by an open middle 
forced distribution technique. Each rater 
was given a sheet with the following direc- 
tions: “The results of this rating will be kept 
confidential. You are a plant superintendent
given the task of selecting five supervisors from
the following list of names. Write the names
of the five men you would choose first. Write
the names of the five men you would consider
last. Omit your own name. Try not to let
personal feelings affect your decision. Do not
sign this paper.”

The names of the supervisors in the three
groups were placed on separate sheets. A
score for each supervisor was obtained by as-
signing three points for each time he was men-
tioned in the first category, 1 point for each
time he was mentioned in the last category,
and 2 points for each time he was automatically
placed in the open middle. The total of these
points was then divided by the number of
raters for that particular group and multi-
plied by 10 to remove the decimal point.

The uncorrected split-half reliabilities of
the ratings were computed and found to range
from .75 for group B to .82 for group A. The 
reliability was .85 for the three groups com- 
bined. The criterion is more reliable than is 
usually found because it is a “pooled rating.” 
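The open-middle scoring rule described above can be sketched as follows. The function and argument names are hypothetical, and the example counts are invented. The sketch also assumes every rater in the group rates every supervisor, so it ignores the fact that raters omitted their own names.

```python
def pooled_rating(first_mentions, last_mentions, n_raters):
    """Pooled rating on the scale described in the text (range 10 to 30).

    3 points per mention among the first five, 1 point per mention among the
    last five, and 2 points for every rater who named the man in neither
    list (the "open middle").
    """
    middle = n_raters - first_mentions - last_mentions
    if min(first_mentions, last_mentions, middle) < 0:
        raise ValueError("mentions cannot be negative or exceed the number of raters")
    points = 3 * first_mentions + 1 * last_mentions + 2 * middle
    # Divide by the number of raters, then multiply by 10 to drop the decimal.
    return points / n_raters * 10


# Invented example: 15 raters, 6 place the man first, 2 place him last.
example_rating = pooled_rating(first_mentions=6, last_mentions=2, n_raters=15)
```

A man named by nobody in either list scores exactly 20, the midpoint of the scale, which is what makes the middle category "open".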


Results 


The coefficients of correlation between the 
criterion and scores on the Mechanical Compre-
hension Test, Otis Quick Scoring Mental
Ability Test, scores on each part of The 
Personal Audit, scores on each part of the 
Kuder Preference Record and scores on each 
part and total scores for Form A and B sepa- 
rately and combined for How Supervise? are 
shown in Table 1. 

The coefficients of correlation shown in 
Table 1 included all supervisors who had taken 
each test. Therefore, these coefficients differ 
somewhat from those shown in Table 2 be- 
cause these were recomputed to include only 
those supervisors who had taken all of the 
three best predictors for each group. If a 
supervisor had failed to take one of the tests 
used to obtain the multiple correlation for a 
group, his score had to be removed from cor- 
relation of the other two tests so that identical 
sets of individuals were used for each of these 
three tests. 

In order to determine the extent to which 
the three best predictors measured supervisory 
success as indicated by the ratings, coefficients 
of multiple correlation and their standard 
errors were computed for each group and for 
the three groups combined. Regression co- 
efficients and the regression equation were 
established for the three groups combined. 
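The standard errors reported with the multiple correlations in Table 2 (.13, .10, .09 and .05) are consistent with the classical large-sample formula (1 − R²)/√N. The paper does not say which formula was used, so the sketch below is an inference from the printed values rather than the author's stated method, and the function name is hypothetical.

```python
from math import sqrt


def se_multiple_r(multiple_r, n):
    """Classical large-sample standard error of a correlation coefficient.

    ASSUMPTION: (1 - R^2) / sqrt(N); chosen because it matches the values
    printed in Table 2, not because the paper names it.
    """
    return (1 - multiple_r ** 2) / sqrt(n)


# (multiple R, number of supervisors taking all three tests), from Table 2.
table_2 = {"A": (0.74, 12), "B": (0.80, 12), "C": (0.80, 15), "Total": (0.84, 36)}

standard_errors = {
    group: round(se_multiple_r(r, n), 2) for group, (r, n) in table_2.items()
}
```

With only 12 to 15 men and three predictors chosen from many, this formula takes no account of the selection of predictors, which fits the caution about cross validation expressed later in the paper.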

The regression coefficients show relative 
weights of each of these three measures in 
contributing to the supervisory ability as 
measured by the pooled ratings of fellow 
supervisors. The regression equation may be 
used to predict the rating of any particular 
supervisor from the knowledge of his scores on 
each of the three best predictors. For in-
stance, if a prediction of the rating is to be
made for Mr. X, his score of 21 on part three 
of How Supervise? Form A, his score of 31 on 
the artistic section of the Kuder Preference 
Record, his score of 32 on the Mechanical 
Comprehension Test are multiplied by the 
relative weights of these three factors. The 
sum of these products added to the constant 
gives the predicted rating. The equation for 
the predicted rating (P) for Mr. X is: 


P = .244 × 21 + .023 × 31 + .233 × 32 + 5.107 = 18.4.


It can be predicted that Mr. X should have 
a rating of 18.4. An examination of the actual
rating indicates that he received a pooled
rating of 17 by his fellow supervisors. The 
coefficient of multiple correlation indicates 
the degree of accuracy of the prediction of the 
rating from the regression equation. These 
coefficients were .74 for group A, .80 for 
group B, .80 for group C and .84 for the three 
groups combined. 
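The worked example for Mr. X can be reproduced directly from the regression equation given above. The weights, constant and scores are those printed in the text. The function and argument names are hypothetical labels chosen for this sketch.

```python
# Regression weights and constant for the three groups combined, as printed.
WEIGHT_HOW_SUPERVISE_A_PART_3 = 0.244
WEIGHT_KUDER_ARTISTIC = 0.023
WEIGHT_MECHANICAL_COMPREHENSION = 0.233
CONSTANT = 5.107


def predicted_rating(how_supervise_a_part_3, kuder_artistic, mechanical_comprehension):
    """Predicted pooled rating from the three best predictors (total group)."""
    return (
        WEIGHT_HOW_SUPERVISE_A_PART_3 * how_supervise_a_part_3
        + WEIGHT_KUDER_ARTISTIC * kuder_artistic
        + WEIGHT_MECHANICAL_COMPREHENSION * mechanical_comprehension
        + CONSTANT
    )


# Mr. X: scores of 21, 31 and 32; his actual pooled rating was 17.
mr_x_predicted = predicted_rating(21, 31, 32)
mr_x_residual = 17 - mr_x_predicted
```

The prediction overshoots the actual rating by about 1.4 points. The small weight on the Kuder artistic score means it contributes well under one point to the total.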

The coefficient of multiple correlation for 
the three groups combined (.84) being quite 
high might give the erroneous impression that 
the predictors used to obtain it are highly 
effective for each of the three groups. How- 
ever, this is not true. Two of these predictors 
are ineffective for group A. 


Table 1 


Coefficients of Correlation between Criterion and Test Scores 


Group A Group B Group C Total Group
N r N r N r N r 
Personal Audit 
Seriousness 13. =.63 15, (a6 14 09 42 —.20 
Firmness 13 10 15 13 EAN 43 .06 
Tranquillity 13 —.42 15 —09 15 —.03 43 —.16 
Frankness 13 12 15 10 (oe i 43 05 
Stability 13 -—,06 15) ° —.09 15” 7.08 a3. = 108 
Tolerance 12 13 15 52 14 .03 41 18 
Kuder Preference Record 
Mechanical 14 04 15 04 15 15 44 11 
Computational 14 —.07 15 .03 15 —.24 44 Et) 
Scientific 14 13 15 09 15 2 44 AS 
Persuasive 14 —.40 15 04 15 Sg 44 BEG 
Artistic 1464 banas 15 12 44 32 
Literary 14-05 {50 ne sail aa E 
Musical 14 -04 15 —.10 15 —44 44 —.16 
Social Service 14 =28 15 =.99 15 45 44 —.09 
Clerical 14 ~.16 15d if ay yo wl 
How Supervise? 
Form A Part I R E 13 16 17 15 43 02 
Part IT i 08 B —06 17 —06 4 = 08 
Part II 13 —.04 136 ive) ite ag 5 A 
ata 13 —.06 1337 17st PERE” 
Form B Part I 133 —.24 15 .03 17 48 45 14 
Part II i =.02 15 61 17, 43 45 30 
Part II 13 —.04 15 46 17 35 45 3 
Total i3 102 15 53 17 55 45 3t 
Form A&B Part I uo =i 12 23 17 39 40 jt 
Part II uo 05 2 k 97 i 
Part II vie id ee 7 130 40 0 
Total ite 03 12-37 17, 58 i 
Otis Test 4 20 5O D 16 19 45 R 
Mech. Compr., AA 13 —.24 16 15 15 39 44 2 




Table 2

          Uncorrected                                                              Number of      Coefficient   Standard Error
          Split Half     Three Best Predictors with Coefficients of                Supervisors    of Multiple   of Multiple
          Reliability    Correlation with Criterion                                Taking These   Correlation   Correlation
          of Criterion                                                             Three Tests

Group A   .82            Kuder Preference Record, Artistic (r = .66)               12             .74           .13
                         Personal Audit, Seriousness (r = −.58)
                         Personal Audit, Tranquillity (r = −.40)

Group B   .75            How Supervise?, Part III, Form A + Form B (r = .66)       12             .80           .10
                         Personal Audit, Tolerance (r = .58)
                         Kuder Preference Record, Social Service (r = −.37)

Group C   .80            How Supervise?, Part II, Form A (r = .65)                 15             .80           .09
                         Kuder Preference Record, Clerical (r = −.47)
                         Kuder Preference Record, Musical (r = −.44)

Total     .85            How Supervise?, Part III, Form A (r = .63)                36             .84           .05
                         Kuder Preference Record, Artistic (r = .13)
                         Bennett Mechanical Comprehension (r = .62)


Although all of the coefficients of multiple 
correlations were surprisingly high, being in 
the order of the magnitude of the reliabilities 
of the criteria and of the tests, this should not 
be construed as evidence that any of the sets 
of three best predictors would necessarily be 
effective for predicting supervisory ability in 
general. The effectiveness of the predictors 
varied greatly from group to group. This is 
very probably due to the unstable zero order 
correlations and the tendency of multiple cor- 
relation to unduly magnify chance errors of 
measurement. Because of this variation a 


cross validation study should be made for 
every supervisory group for which predictors 
are desired. 

The results presented indicate that, at least 
in some instances, supervisory ability can be 
measured fairly accurately by psychological 
tests by selecting the best predictors from a 
relatively large number of likely indices (29 in 
this study). It is also necessary to have re- 
liable criteria such as pooled ratings by a 
large number of raters who know the super- 
visors being rated exceedingly well. 


Received February 13, 1952. 


Studies of Group Behavior: Factors Associated with the 
Productivity of Groups ¹


John G. Darley 


University of Minnesota 


Neal Gross ²


Harvard University 


and 


William C. Martin ²


University of Illinois 


This is one of a series of papers reporting 
the results of a study of the relations among 
selected sociological and psychological vari- 
ables in the behavior of small, organized 
groups. The present phase of the total study 
deals with the analysis of factors related to 
group productivity. The groups studied com- 
prised thirteen women’s small residence units 
organized as a cooperative housing project 
under the auspices of the University of Minne- 
sota. Background data on these small groups 
appear in items (1) and (2) of the References. 


The Criterion of Productivity 


During the course of the investigation, an 
attempt was made to get an estimate of the 
relative productivity of the groups when en- 
gaged in a common and meaningful task as a 
criterion measure against which to correlate 
certain group variables. 

While the cooperative housing project 
served primarily to provide room and board 
for the students at a lower cost than was else- 
where available to them, the University 
sought, by giving it full recognition as a stu- 
dent activity, to make the residence experience 
a worth-while phase of personal growth and 


1 The studies to be reported in this and subsequent 
articles, under the general title of Studies of Group 
Behavior, have been carried on by the Laboratory for 
Research in Social Relations at the University of Min- 
nesota under grants from the Graduate School; the 
College of Science, Literature, and the Arts; and from 
the Carnegie Corporation. They are part of a larger 
research program dealing with the broad problems of 
social responsibility.

2 Members of the Minnesota Laboratory at the time 
of the research. 


development for them. Problems of co- 
operative living were stressed in their broader 
implications for society generally. Against 
this background of a consciously-shared ex-
perience, the experimenters proposed that
each house prepare a “plan for better co-
operative living in the Village.” Each house
was allowed one month during the spring
term for the preparation of its report, as a
house project and in conformity to a suggested 
standard outline. Cash prizes of $100, $75, 
$50, $35, and $25 were announced for the five 
top-ranking houses in the contest. Faculty 
members who agreed to serve as judges were 
asked to make an over-all ranking of the 
thirteen reports in the form of a single total 
judgment, and the criteria to be used by the 
judges were also given to the students at the 
start of the contest. 


The Independent Variables 


The predictor variables in such a study 
must bear some explicit or implicit relation to
tentative hypotheses regarding factors asso-
ciated with group productivity. As a first
approximation, the problem was formulated 
in the following manner: 

All other things being equal, the produc-
tivity of a group will depend upon the extent
to which: A. the group goal is accepted; B.
the previous group experience has been suc-
cessful and satisfying; C. appropriate use is
made of differential skills within the group
membership; D. leadership is both accepted
and persistent; and E. there are available
within the group individuals of requisite
ability or special skill.

Before describing the independent variables 
assumed to measure the behavior needed to 
test the above hypotheses, it is necessary to 
describe briefly one additional source of data, 
the participant-observer reports. The Uni- 
versity-assigned counselor in each house was 
asked to play the role of participant-observer 
during the month in which the reports were 
prepared. The tasks of the participant-ob- 
server were as follows: to prepare a narrative 
account of each house meeting held concerning 
the contest; to report on the attendance at 
each such meeting; to rate every participant 
at these meetings on a five-step scale for 
specified behavior. 

The narrative accounts of the meetings were 
designed to cover: enthusiasm for the task to 
be accomplished by the group; strength and 
sources of leadership in the task; type of leader- 
ship (democratic vs. authoritarian); evidences 
of conflict within the group; and organiza-
tional efficiency with which the group carried
out the task. The participant-observers rated 
each student present at each meeting on a 
five-step scale for: extent of participation in 
the meeting; usefulness of suggestions; co-
operation; willingness to assume responsibility;
and leadership. 

The narrative reports submitted by the 
participant-observers were later independently 
analyzed by two raters, who used simple five- 
step scales (with step 1 being arbitrarily as- 
signed as the “good” score) to assess: strength 
of leadership; type of leadership (“autocratic 
vs. democratic”); efficiency of organization; 
evidences of interpersonal conflict; enthusiasm; 
and amount of leadership provided by the 
counselor (who was also the participant-ob- 
server). These assessments, based on all 
narrative reports for each house, represented 
composite analyses of house behavior during 
the entire month of the contest; final house 
ranks were assigned by averaging the ranks 
assigned by the two raters for a particular 
variable for each meeting held by the group 
during the month of the contest. 

Returning now to the theoretical formula-
tion of the problem, it is in order to list the
kinds of data derived as measures of the inde-
pendent variables related to productivity.
These are listed herewith, under the five
general behavior categories:³


A. Acceptance of group goal: 


1. Volume of participation (attendance
records)

2. Extent of participation (as rated by
participant-observers)

3. Assumption of responsibility (as rated
by participant-observers)

4. Enthusiasm for the task (as judged by
case readers)


B. Success or satisfaction with previous group 
experiences: 


5. Satisfaction with Village life (from auto- 
biographical reports) 

6. House dislike ratio (from end-of-year 
sociometric schedules) 

7. House ratio of in-group to out-group 
choices for “social support” (from start- 
of-year sociometric schedules) 

8. House ratio of in-group to out-group 
choices for ‘‘confidantes”’ (from start-of- 
year sociometric schedules) 

9. House ratio of in-group to out-group 
choices for “intimate friends” (from end- 
of-year sociometric schedules) 

10. Evidence of personality conflict (as 
judged by case readers) 


C and E. Availability of essential skills and 
appropriate use of such skills: 
(These two dimensions are treated to- 
gether in the following list.) 


11. House scholastic achievement (house 
end-of-year grade average) 

12. Efficiency of house organization for the 
contest (as judged by case readers) 

13. Usefulness of suggestions (as rated by 
participant-observers)

14. Contribution to the contest (from end- 
of-year sociometric schedules) 


D. Patterns of leadership: 


15. Strength of leadership (as judged by case 
readers) 

16. Type of leadership (as judged by case 
readers) 

17. The leader role of the counselor (as 
judged by case readers)

18. Support of formal leader (from end-of- 
year sociometric schedule) 


3 Definitions of these 18 variables and Table 4 con- 
taining their house means and ranks have been filed 
with the American Documentation Institute. Order 
Document 3616 from American Documentation Insti- 
tute, 1719 N Street, N.W., Washington 6, D. C., re- 
mitting $1.00 for microfilm (images 1 inch high on 
standard 35 mm. motion picture film) or $1.20 for 
photocopies (6X8 inches) readable without optical aid. 
A longer manuscript of the report also is available for
reference from the senior author. 




These variables range all the way from 
clearly objective data (school grades and at- 
tendance data for house meetings) to more or 
less tenuous and possibly unstable measures of 
interpersonal relations from sociometric sched- 
ules. The interactions in each house are, 
furthermore, seen through the eyes of three 
different sets of observers: the students them- 
selves who were participating in the contest; 
the participant-observers, who, with minimal 
training, were recording house progress during 
the contest as well as participating; and the 
two judges who were the only ones to have a 
view of all participant-observer reports, for 
purposes of comparing the houses on defined 
behavior. 

Because the basic N for the sample of houses 
is thirteen, rank-order correlations must be 
used as the approximate estimate of relation- 
ships. Implicit in this procedure is the as- 
sumption that the behavior of a particular 
house can be indexed by averaging behavior 
measures for its membership of seven to six- 
teen students taken singly, i.e., that the group 
is merely the average of its component parts, 
disregarding the variability of the component 
members around the group average. An 
alternative assumption, and one equally crucial 
to test, is that the group’s behavior in certain 
areas is best indexed by minimum variability 
around a known average position, as would be 
the case, for example, in estimating the pos- 
sibility of change in attitudes for a group show- 
ing a strong original attitude in opposition to 
prevailing authority, accompanied by a small 
standard deviation around the mean opposi- 
tion score. 
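
The rank-order procedure described above can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, the figures in the usage note are invented rather than taken from the study, and tied house averages are assumed to receive the mean of the ranks they occupy.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Rank-order correlation: the product-moment r of the two sets of ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

For instance, `spearman_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])` works out to .80, which agrees with the familiar formula 1 - 6Σd²/(N(N² - 1)) when there are no ties. In the present setting each input would be the thirteen house averages on one variable.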

A further assumption, more directly amen- 
able to testing in this study, concerns the 
varieties of behavior defined by the eighteen 
independent variables. These were defined
and grouped on essentially logical grounds
at various stages of the year’s field study. As 
a first approximation, therefore, the behaviors 
so defined and grouped should show relatively 
high patterns of intercorrelations within a 
group and relatively low correlations with be- 
haviors from other groups, if our formulations 
are to have meaning. More specifically, any 
two independent variables which appear to be 
closely related logically and operationally
might be expected to show a sufficiently high
intercorrelation to be interpreted as a meas- 
ure of the reliability of two assessment 
methods. Where such expectations fail to 
materialize, either the variables observed 
derive from different domains of behavior, in 
spite of the a priori logic by which they were 
grouped, or the methods of measuring the 
variables are insufficiently precise, or the 
variables, while relatively unrelated, may 
still tap effectively component parts of the 
criterion measure and might show a substantial 
loading of a more general factor underlying 
the total matrix. 


Results 


The Criterion. Table 1 lists the rankings 
assigned by each of five faculty judges to the 
final reports prepared by each of the thirteen 
houses. Judges 3 and 4 appear to deviate 
most markedly from the remaining three 
judges in their evaluations. However, on the 
basis of the average ranking of the five judges, 
Houses M, E, D, K, and G, respectively, were 
awarded the announced cash prizes as soon as
possible after the judges had made their
independent reports to the research team and
before the intercorrelations of judges had been
determined.


Table 1 


Ranks Assigned by Each Judge to Essays Produced 
by Specified House Unit* 


Judge
House 1 2 3 4 5
A BOs GY) 6 9 9.5 
B 25% oo 1 Saleen 
c 2 ü 2 6 Pe 
D eee 1 Ae Oey 
E 4. 11S 5 2 45 
F 13 13 13 13 13.0 
G 5 2 4 it 2.0 
H 10 8 11 10 12.0 
I 6 6 7 3 8.0 
J 7 7 12 8 6.5 
K 3 3 8 7 3.0 
L oS” Ad O TES 9.8 
M 1 1 3 1 1.0 


* The five cash prizes were awarded to Houses M, E,
D, K, and G, respectively.




Table 2, showing the rank-order intercor- 
relations between all pairs of judges, indicates 
more clearly the degrees of agreement. The 
average intercorrelation of Table 2, based on
five judges, is .59. Averaging the intercor- 
relations of Judges 1, 2, and 5, the resultant 
value is .85. Because these three judges 
showed a greater comparability and con- 
sistency in their evaluations, their rankings 
only have been used in the more detailed analy- 
ses of the factors associated with the produc- 
tivity of the groups. 

Another aspect of the criterion should be 
mentioned briefly. While the residents of the 
Village participated quite willingly in the task 
imposed by the experimenters, even to the 
extent of requesting that their detailed sug- 
gestions for improvement be sent anonymously 
at the end of the contest to the University 
administration, the task toward which they 
directed their energies involved essentially a 
verbal formulation and expression of situations 
they had self-consciously been experiencing 
and discussing throughout the year. The 
better performers in preparing such a report 
might therefore be the “better” group, on any
one of a number of scales of “goodness” as a 
well-satisfied, well-organized residence unit. 
Had the members of each house been faced 
with a task unrelated to their experiences of 
living together, the resultant criterion meas- 
ure might show an entirely different set of 
relations to the same independent variables or 
the maintenance of the same relative weights 
at a much lower level of absolute predictive 
power.

These reservations about the criterion can 
be viewed in at least two ways. First, one 
might argue that the results of the study can- 
not be generalized—that group forces relating 
to productivity cannot be presumed to operate 
independently of the content of the task 
toward which the group is oriented as the 
criterion measure. Alternatively, one might
argue that the predictor variables will retain 
their relative weights for any dependent 
(criterion) variable of approximately the same 
degree of relevance to the group. Relevance, 
in this sense, becomes the crucial property or 
quality of the criterion task to be perceived by 
the participants and transcends the task itself 


Table 2 


Rank-Order Intercorrelations between Pairs of Judges 
in Evaluating House Essays 


Judge      2     3     4     5
  1       .78   .23   .61   .92
  2             .47   .52   .85
  3                   .59   .42
  4                         .50


Average rho, 5 judges = .59. Average rho, Judges 1,
2, and 5 = .85.


to a degree that will permit generalization. 
The property of relevance might be com- 
pounded of no less than the following elements: 
the group’s perception of the importance of 
the task; the group’s desire to accomplish the 
task; the group’s confidence that its members 
know how to do the task. 

The Predictor Variables. All house averages 
on the criterion variable, and also on the 
eighteen predictor variables previously de- 
scribed, were obtained and used in arriving at 
the rank-order correlations set forth in Tables 
3 and 4.⁴ By reference to the list of variables
given earlier, it is possible to study the cate- 
gories of predictor variables and to make some 
observations regarding the consistency with 
which certain behaviors were observed in this 
study. 

Variables 4, 10, 12, 15, 16, and 17 in Table 3 
resulted from the average ratings of two in- 
dependent judges, who read all the participant- 
observer reports and rated them for the 
variables involved. The degree of agreement 
between these two judges in assigning house 
ratings to these reports is shown in Table 5. 

In assessing strength of leadership, efficiency 
of organization for the task, and degree of 
enthusiasm of the participants, the judges are 
in substantial agreement, although they show 
less agreement for the other three variables. 

Variables 2, 3, and 13 are derived from
ratings made after each meeting by the partici- 
pant-observers, who rated each girl in attend- 
ance, including themselves, on a five-step scale 


4 Table 4, giving house averages and resultant ranks,
has been filed with the American Documentation Insti-
tute. See footnote 3.


Table 3 
Rank-Order Intercorrelations among 18 Independent Variables and the Dependent Variable of Rated Productivity 
Variables 
Produc-
Variables tivity 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
1. Volume of participation .02 i 
2. Extent of participation 21 26 
3. Assumption of responsibility 25 —.21 74 
4. Enthusiasm for the task 59 43 66 51 >] 
5. Satisfaction with Village life -64 30 —.03 —.09 .27 S 
6. Dislike ratio 38 —33 17 16 38 16 © 
7. In-group/out-group: “social T 
support” 50 =35 29 is 19 —14 37 2 
8. In-group/out-group: “confi- A 
dantes” .19 -10 16 19 ll 49 04 —.03 Ss 
9, In-group/out-group: “inti- = 
mate friends” 27 08 —45 —45 —.27 57 —.37 0 34 i 
; . Aa 
10. Evidence of personality Q 
conflict 25 —.04 at 17 52 —.22 58 32 34 —46 S 
11. House scholastic achieve- > 
ment —.01 —.02 19 52 27 —.25 —.23 —.02 39 —.16 09 
12. Efficiency of house organiza- 
tion for the contest 86 03 48 45 83 46.56 41 10) —.04 46 07 
13. Usefulness of suggestions 13 27 -80 77 72 —.12 il w G = or 65 42 
14. Contribution to the contest —.05 Al 46 .23 35 .09 003 —.15 a4 =—.21 -12 A2 4 54 
15. Strength of leadership 16 =—33 —.13 —07 —.13 =Al1 .007 AS 24 0 34 —.27 10 02 —.25 —.26 
3S 10 —.08 30 18 —.31 .54 02 37 16 


23 22 23 SS lil .60 .19 


16. Type of leadership 
15 73 S54 24 —06 31 —06 


17. Leader role of the counselor 
18. Support of formal leader S1 .30 




Table 5 


Rank-Order Correlations of Ratings of Participant- 
Observers’ Reports by Two Judges 


Variable                                    Rho (N = 13)
Enthusiasm for task                         .84
Evidence of personality conflict            8
Efficiency of house organization            .81
Strength of leadership                      .89
Type of leadership                          SL
Leader role of counselor (also
participant-observer)                       .46


for the behavior in question. It is evident
from the high intercorrelations of these three
variables in Table 3 that the participant-ob- 
servers seem to be working within the frame- 
work of a general halo about the girls under 
their observation, with insufficient specificity. 

Variable 14 represents the judgment of the 
actual participants in each house regarding 
their own contributions and the contributions 
of their house-mates in the task of preparing 
the report. This variable might be expected 
to show a marked degree of agreement with 
the judgments of the participant-observers, as 
indexed in Variables 2, 3, and 13. In looking 
at the three relevant correlations, however, it 
would not appear that the participants them- 
selves and the participant-observers were in 
close agreement on the behavior under con- 
sideration. Nor does the girls’ evaluation of 
each other’s contributions relate closely to the 
ratings of the judges who saw the progress of 
the contest through the narrative reports sub-
mitted by the participant-observers (Variable
14 vs. Variables 4, 10, 12, 15, 16, and 17).

There does not appear to be extremely high 
agreement among the participants, the partici- 
pant-observers, and the judges who reviewed 
the entire range of participant-observer re- 
ports in the assessment of behaviors that on a 
priori grounds appear to be related. Whether 
this may be due to poor training of the raters, 
disjunctions between relative and absolute 
judgments, or psychological differences be- 
tween apparently similar behaviors cannot be 
determined from the data. It is likely that 
all three of these factors contribute to the low 
patterns of intercorrelations. 


Table 3 also permits an analysis of the com-
munality of the groups of predictor variables.
Under the general rubric of “acceptance of the 
group goal,” for example, four specific vari- 
ables are included: volume of participation; 
extent of participation; assumption of re- 
sponsibility; and enthusiasm for the task. 
The six possible intercorrelations among these 
four variables appear near the top of Table 3. 
The average intercorrelation for this set of 
six, including the negative correlation, is ap- 
proximately .40—a correlation that does not 
warrant any assumption of great communality 
among the variables logically grouped as facets 
or coordinates of the general concept of “ac- 
ceptance of the group goal.” Furthermore, 
these four variables show a wide range of 
correlation with the criterion measure; only 
one of the four—enthusiasm—approximates a 
relationship significantly greater than zero for
the N of 13.
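
The averaging described in this paragraph can be checked with a short sketch. The six coefficients are the entries for Variables 1 to 4 as they read in Table 3; the function name is hypothetical, and a simple arithmetic mean is assumed, since the text does not say whether any transformation was applied before averaging.

```python
from itertools import combinations

# Rank-order intercorrelations among Variables 1-4, as read from Table 3.
# Keys are (lower variable number, higher variable number).
rho = {
    (1, 2): 0.26,   # volume vs. extent of participation
    (1, 3): -0.21,  # volume vs. assumption of responsibility
    (2, 3): 0.74,   # extent vs. assumption of responsibility
    (1, 4): 0.43,   # volume vs. enthusiasm
    (2, 4): 0.66,   # extent vs. enthusiasm
    (3, 4): 0.51,   # assumption of responsibility vs. enthusiasm
}


def mean_intercorrelation(variables, table):
    """Arithmetic mean of the coefficients for every pair of the given variables."""
    pairs = list(combinations(sorted(variables), 2))
    return sum(table[p] for p in pairs) / len(pairs)


avg = mean_intercorrelation([1, 2, 3, 4], rho)
print(round(avg, 2))
```

The mean comes to just under .40, in agreement with the figure quoted above; dropping the single negative coefficient would raise it, which is presumably why the text notes that it was included.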

Similar statements can be made about the 
other groups of predictor variables, with re-
spect to their patterns of intercorrelations as
well as their correlations with the criterion
variable.

These data have some bearing on the con- 
ceptualization phase of designing experiments 
on small groups, since the concept is in part 
given meaning by the choice of behavior 
deemed to be coordinate with it. In Table 
3, for example, six possible behavioral indices 
of “satisfaction with previous group experi- 
ence” are presented, and the intercorrelations 
among them are shown. In addition, when 
each is correlated with the criterion, only two 
(satisfaction with Village life and ratio of in- 
group to out-group choices for “social sup- 
port” on the first sociometric schedule) ap- 
proximate significance in predicting produc- 
tivity with the N of 13. A posteriori, some 
of the attenuating factors become evident: 
the six behavioral indices are drawn from 
different time points in the experimental 
year; they are assessed or measured with 
indeterminate accuracy and by different ob- 
servers; they are drawn from behavioral 
domains, in terms of low intercorrelations of 
sociometric data, showing considerable and 
surprising specificity (2). 
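
The text does not say which test of significance was applied to the rank-order coefficients. One common approximation, assumed here purely for illustration, converts rho to a t ratio with N - 2 degrees of freedom. The function name is hypothetical; the critical values are the tabled values of t for 11 degrees of freedom.

```python
import math


def t_for_rho(rho, n):
    """t approximation for a rank-order coefficient, with n - 2 degrees of freedom."""
    return rho * math.sqrt((n - 2) / (1 - rho ** 2))


N_HOUSES = 13
T_CRIT_TWO_TAILED = 2.201  # tabled t, df = 11, .05 level, two-tailed
T_CRIT_ONE_TAILED = 1.796  # tabled t, df = 11, .05 level, one-tailed

# Criterion correlations quoted from Table 3 for the two indices named above.
for label, rho in [("satisfaction with Village life", 0.64),
                   ("in-group/out-group, social support", 0.50)]:
    t = t_for_rho(rho, N_HOUSES)
    print(label, round(t, 2),
          "exceeds two-tailed value:", t > T_CRIT_TWO_TAILED,
          "exceeds one-tailed value:", t > T_CRIT_ONE_TAILED)
```

Under this assumed approximation a coefficient near .64 clears the two-tailed value, while one near .50 clears only the one-tailed value, which is consistent with the cautious wording "approximate significance" for an N of 13.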

The problem of identification, accurate as-
sessment, and meaning of variables, is certainly
not a new one in psychology; it applies with
equal cogency in research on groups as well as 
research on individuals and it emphasizes the 
importance of careful derivations of behavioral 
coordinates in testing the relationships in- 
volved in group processes. 

The Predictions of Productivity. Table 3
permits the choice of several possible com-
binations of variables for use in a multiple
regression equation, in such a way that the
earlier hypothetical formulation can be sup-
ported with varying degrees of confidence.
For example, the Wherry-Doolittle shrinkage
formula⁵ can be used to estimate the R with
productivity that might be obtained from a
multiple containing: volume of participation;
satisfaction with Village life; achievement;
and percentage of votes received by the formal
leader. These four variables commend them-
selves as relatively objective measures of
factors crucial to the test of the general formu-
lation of the problem as given at the start of
this paper. Two of these four give as an
estimate of R a value of .676; these two are:
satisfaction with Village life and percentage
of votes received by the formal leader. The
addition of the other two variables shrinks
the results to a lower estimate of R.
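
Why adding predictors can lower the estimate may be seen from a small sketch. It is not the authors' computation. The correlations are invented, the function names are hypothetical, and the shrinkage correction is assumed to take the widely used form 1 - (1 - R²)(N - 1)/(N - m - 1), which may differ in its denominator from the form given by Stead and Shartle.

```python
import math


def multiple_r_two_predictors(r1, r2, r12):
    """Multiple R of a criterion on two predictors.

    r1, r2: correlations of each predictor with the criterion.
    r12:    correlation between the two predictors.
    """
    r_sq = (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)
    return math.sqrt(r_sq)


def shrunken_r(r, n, m):
    """Shrinkage-corrected R for n cases and m predictors (assumed form).

    Returns 0.0 when the corrected R squared would be negative.
    """
    adjusted = 1 - (1 - r ** 2) * (n - 1) / (n - m - 1)
    return math.sqrt(adjusted) if adjusted > 0 else 0.0


# Hypothetical values, chosen only to illustrate the behavior with N = 13.
R = multiple_r_two_predictors(0.60, 0.50, 0.20)
print(round(R, 3))                      # uncorrected R for two predictors
print(round(shrunken_r(R, 13, 2), 3))   # corrected for two predictors
print(round(shrunken_r(R, 13, 4), 3))   # same R credited to four predictors
```

With only thirteen houses, each added predictor costs a degree of freedom, so two further variables that contribute little to the uncorrected R leave the corrected estimate lower than before, as reported for the four-variable combination.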

But the following three variables give an 
estimated R of .867: satisfaction with Village 
life; percentage of votes received by the formal 
leader; and ratio of in-group to out-group 
choices for “social support” on the sociometric 
questionnaire given at the start of the school 
year. Note that two of these three variables 
are drawn from the same category of variables, 
dealing with “satisfaction with previous group 
experiences.” 

An estimated R of .88 is obtained by use of 
the following four variables: enthusiasm as 
estimated by the judges who read the partici- 
pant-observers’ reports; evidence of personality 
conflict as seen by the same judges; efficiency 
of house organization for the contest; and 


5 Stead, W. H., Shartle, C. L., et al. Occupational
counseling techniques. New York: American Book 
Company, 1940, pp. 245-51. The assumption that 
rank-order correlations are direct equivalents of Pear- 
sonian values has been made in using this formula, 
even though the authors realize it is open to question, 
since the problem here is the relative changes in R esti- 
mates from alternative selections of variables in the 
same matrix. 


amount of leadership judged to have been 
exercised by the counselor. 

Another estimated R has been established 
on still another combination of variables, each
of which is drawn from one of the four general 
rubrics of the original hypothetical formulation 
and each of which shows a substantial corre- 
lation with the criterion of productivity: 
enthusiasm for the task; satisfaction with 
Village life; efficiency of organization for the 
task; and percentage of votes received by the 
formal leader. The R value is .896. When 
an estimated R is computed using only three 
of these four variables (satisfaction, efficiency 
and formal leadership support) the multiple 
R becomes .904.

It appears that in this particular experi- 
mental setting and with the criterion task here 
defined, it is possible to identify factors con- 
tributing to the relative productivity of the 
thirteen groups. The estimated R’s will
differ according to the variables chosen for 
inclusion in the formula; in one instance, at 
least, a set of variables not on a priori grounds 
completely consonant with the conceptualiza- 
tion yields a higher estimated R than sets of 
variables more closely related to the hypotheses 
on logical grounds. However, since one vari-
able alone, efficiency of house organization,
carries the greatest predictive weight, the
multiple regression solution is somewhat un- 
economical here. The study has raised more 
problems than it has solved, one of which 
warrants mention. The multiple regression 
solution assumes a linear and additive relation 
between the independent variables and the 
criterion; it requires also that enough inde- 
pendent variables be included to account for 
all or almost all criterion components. Even 
if, in replications, the criterion components 
have varying weights for other samples of 
groups, the beta weights of the independent 
variables should still remain relatively con- 
stant. It is essential, however, to consider 
whether productivity is a linear and additive 
resultant of various group factors that can be 
hypothesized, or whether alternative formu- 
lations are closer to the reality of group be- 
havior. 
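
The linear and additive assumption can be made concrete with a sketch for the two-predictor case. The correlations are invented and the function names are hypothetical; the formulas are the standard ones for standardized (beta) weights.

```python
def beta_weights(r1, r2, r12):
    """Standardized regression weights for two predictors.

    r1, r2: correlations of each predictor with the criterion.
    r12:    correlation between the two predictors.
    """
    denom = 1 - r12 ** 2
    b1 = (r1 - r2 * r12) / denom
    b2 = (r2 - r1 * r12) / denom
    return b1, b2


def predicted_criterion(betas, z_scores):
    """Linear, additive prediction: a weighted sum of predictor standard scores."""
    return sum(b * z for b, z in zip(betas, z_scores))


# Hypothetical correlations, for illustration only.
r1, r2, r12 = 0.60, 0.50, 0.20
b1, b2 = beta_weights(r1, r2, r12)

# The squared multiple R equals the sum of beta times criterion correlation.
r_squared = b1 * r1 + b2 * r2
print(round(b1, 3), round(b2, 3), round(r_squared, 3))

# A group one standard deviation above the mean on both predictors.
print(round(predicted_criterion((b1, b2), (1.0, 1.0)), 3))
```

Under this model each factor contributes to the predicted criterion independently of the level of the others. If a factor mattered only when another was present, for example skill counting only where the group goal is accepted, the prediction would require product terms that the additive form leaves out.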

If linearity is assumed, a possible next step
in the present study would be the factorial
reduction of the variables represented in Table
3, including the criterion variable. This
solution, however, would be meaningful only 
to the extent that one is dealing with reliably 
estimated and conceptually relevant behaviors 
of a sample of groups known to be repre- 
sentative of some population of groups, and 
to the further extent that the criterion is 
truly a meaningful task for the groups in- 
volved. 

As a result of discussions of such problems 
of group productivity, Schachter set up a more 
incisive experiment on the relations between 
cohesiveness and effect of group induction to 
maximize or restrict group productivity (3). 
His study should be read in connection with the 
present report, since it introduces a variable 
not dealt with here—differential effect of group 
induction in high and low cohesive groups. In 
brief, he finds no necessary relationship be- 
tween cohesiveness and high productivity. 
Group members will accept induction either 
to increase or decrease production. Cohesive- 
ness appears to be a determining variable in 
the “slow-down” condition but not in the 
“speed-up” condition. When the group in-
duces forces to increase production, both high
and low cohesive subjects accept group in-
duction and increase their output markedly.
When the group induces forces to decrease 
production, the high cohesive subjects are 
more accepting of group induction and conse- 
quently less productive than low cohesive 
subjects. While the designs of these two 
studies are different, they deal with the same 
general problem and together may form the 
basis for further research on group produc- 
tivity. 

Received January 5, 1952. 


References 


1. Darley, J. G., Gross, N., and Martin, W. E. Studies
of group behavior: stability, change, and inter-
relations of psychometric and sociometric vari-
ables. J. abnorm. soc. Psychol., 1951, 46, 565-
576.

2. Martin, W. E., Darley, J. G., and Gross, N. Studies
of group behavior: methodological problems in
the study of interrelationship of group members.
Educ. psychol. Measmt., 1952, 12, Winter Num-
ber.

3. Schachter, S., Ellertson, N., McBride, Dorothy, and
Gregory, Doris. An experimental study of co-
hesiveness and productivity. Hum. Relat., 1951,
4, 229-238.


MMPI Personality Patterns for Various College Major Groups 


Ralph D. Norman and Miriam Redlo


The University of New Mexico 


A recent study by Daniels and Hunter (3) 
raises two questions which are important in 
guidance, namely, whether there are personal- 
ity patterns which gravitate toward certain 
occupations, and also whether there are fixed 
“personality demands” in various jobs. It 
is the purpose of the present research to ex- 
tend these questions to the college student and 
his choice of major subject. Do students who 
select a major in business, say, have certain 
personality characteristics in common and are 
they different from those who choose art or 
psychology? It was decided to use the 
MMPI to help answer these questions since 
it has validity in pointing out vocational 
personality trends (1, 8, 12, 22, 23), although 
Lough (13, 14) expresses doubt about this. 


Subjects and Procedure 


Subjects were 149 male seniors and graduate
students at The University of New Mexico, 
chosen on the assumption that if personality is 
related to choice of major, the advanced stu- 
dents would exhibit the most pronounced char- 
acteristics. They were distributed as follows: 
psychology, 17; sociology, 3; mathematics, 2; 
chemistry, 4; physics, 12; engineering, 29; 
anthropology, 22; business administration, 23; 
art, 16; music, 1; geology, 8; biology, 4; physi- 
cal education, 2; history, 3; English, 1; ele- 
mentary education, 1; Inter-American Affairs, 
1. These majors were grouped together be- 
cause of logical relationships into seven major 
groupings: psychology-sociology (PS), 20; 
mathematics-chemistry-physics (MCP), 18; 
engineering (Engr), 29; anthropology (Anth), 
22; business administration (BA), 23; art-music 
(AM), 17; geology (Geol), 8. The remainder 
constituted a miscellaneous grouping which 
was not contrasted with the others because of 
its heterogeneity, except that it was included 
in the total group when each grouping was 
contrasted with the total minus itself. 

All Ss took the MMPI, and completed a 
questionnaire, rating their degree of satisfac- 
tion with their major on a 7-point scale and 
indicating which major they would choose if 
they could reselect it. These latter data were 
used because it was hypothesized that those 
who indicated greatest satisfaction would exhibit
the most definite or pronounced personality
pattern for their own grouping. Similarly,
those rechoosing the same or similar
majors, it was felt, should be more like their 
own grouping than those rechoosing differently. 

Results from the attitude scale revealed that 
it differentiated in numbers largely between 
“strongly satisfied” and “satisfied” or less, 
probably because of natural reluctance to admit
dissatisfaction with major after one has
spent considerable time and effort with it.
Data for all 149 cases showed the following
distribution: strongly dissatisfied, 0; dissatisfied,
0; mildly dissatisfied, 3; neutral, 4;
mildly satisfied, 14; satisfied, 66; strongly
satisfied, 62. Consequently, it was decided to
compare the strongly satisfied (hereafter SS)
groups within each major against the
satisfied-and-less (hereafter LS).

Also, data on rechoice of major were analyzed
into 3 categories: “different” (e.g., a
student might indicate a rechoice of a completely
different major, as from Engr to BA);
“different but similar” (e.g., a rechoice of
chemistry instead of physics); and “same.”
Two “undecided” cases were not used, nor was
the Misc grouping, the reason therefor being
evident below.

By use of the t test, the mean score of each
major grouping was compared with the mean
score of every other major grouping on all
MMPI Scales (except ? Scale); also, the mean
score of each grouping was compared with the
mean of the total group less the grouping in
question. Similarly, the means of SS groups
were contrasted with those of LS for each
major grouping and for the total group.
Statistical analysis of the data concerned with
rechoosing majors was more difficult since the
numbers in the categories (other than “same”)
were very small. Hence it was decided to use
the mean T-score deviation on each scale from
the mean T-score of the related major grouping.
The trend of deviations might then suggest
whether or not those who would rechoose
different majors deviate more markedly from
the means of their major groupings than those
who would reselect the same major.
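
The comparison of two group means by the t test can be sketched from summary statistics alone. The sketch below assumes a pooled-variance t and sigmas computed with an n - 1 denominator; the paper does not state which formula was used, and the numbers are hypothetical, not data from the study.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with a pooled variance estimate.

    sd1 and sd2 are assumed to be sample standard deviations computed
    with an n - 1 denominator (an assumption; the source does not say
    how its sigmas were obtained).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / std_err, df


# Hypothetical values for illustration only.
t_value, df = pooled_t(10.0, 2.0, 10, 8.0, 2.0, 10)
print(round(t_value, 3), df)
```

Where the two variances differ markedly, a correction for heterogeneity of variance would be applied instead of pooling; that case is not covered by this sketch.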


Results 


1. Results for Major Groupings Compared with
Each Other and with “Total.”¹ Table 1
presents the means and sigmas of T-scores² of
each major grouping and the total group of 149
cases. Table 2 lists significant
differences, checked for homogeneity of variance,
among the major groupings, and between
the particular major grouping and the “total”
group. Significant differences, discussed later,
appear for all scales except Pt and Hs.

2. Results for SS Compared with LS. No
significant differences appeared when means
for these 2 groups were compared within each
grouping. The small N’s and perhaps our
manner of dichotomizing the groups may have
affected this approach. Accordingly another
approach was tried. In Table 2, 8 instances
occurred wherein the scales discriminated between
the particular major grouping and the
“total.” It might then be hypothesized that
the SS group within each major should have
a mean score in relation to (i.e. higher or
lower than) that of the LS group similar to
that of the major grouping itself in relation to
“total.” Table 3 gives results of this contrast.
According to Table 3, the prediction holds in
5 cases, falls down in 2 and is indecisive in 1
(Pa for PS) where the two means are almost
identical. The best conclusion we may make
here is that there is a tendency for SS groups
to be more like their own major groupings on
certain discriminative scales than LS groups.

When the total group of 149 students was
divided into SS and LS, 2 statistically significant
MMPI differences emerged. Regardless
of major, the 62 in the SS group exhibit
more feminine interests (Mf mean 65.02)
than do the 87 in the LS group (Mf mean
61.01), the difference being significant at the
2 per cent level (t of 2.42). In addition, the
SS group was significantly less psychasthenic
(Pt mean 54.85) than the LS group (Pt mean
58.23), the difference being significant at the
5 per cent level (t of 2.17). The SS group
also had a significantly higher major grade
point average (2.23) than the LS group
(1.79), the t being 4.40, P < .001.


Table 1

Means and Sigmas of T-Scores of Each Major Grouping and Total Group on the MMPI

Major Grouping            N        L*   F*   K     Hs    D     Hy    Pd    Mf    Pa    Pt    Sc    Ma
Psychology and            20   M   [illegible]
  Sociology                    σ   [illegible]
Mathematics, Chem.        18   M   3.4  4.2  55.8  54.9  58.2  56.9  55.0  61.9  53.3  55.8  52.8  55.8
  and Physics                  σ   [illegible]
Engineering               29   M   2.3  3.3  56.3  53.1  54.3  55.3  55.2  59.2  54.9  55.8  54.7  54.6
                               σ   [illegible]
Anthropology              22   M   3.0  5.5  57.1  53.0  55.1  56.8  55.0  64.7  54.6  54.6  55.5  58.3
                               σ   [illegible]
Business Administration   23   M   3.1  4.4  58.0  56.1  49.4  58.5  59.0  59.8  54.2  57.8  57.0  60.8
                               σ   [illegible]
Art and Music             17   M   3.8  4.6  61.0  55.9  55.1  61.1  60.8  72.7  56.9  57.6  61.7  62.1
                               σ   [illegible]
Geology                    8   M   2.6  5.1  61.3  52.4  51.8  61.0  62.6  64.0  54.1  59.4  60.8  63.4
                               σ   [illegible]
Total (including Misc)   149   M   2.8  4.6  57.6  54.1  53.9  57.8  58.1  62.7  53.4  56.8  57.2  59.0
                               σ   [illegible]

* Raw scores.


¹ “Total” refers to the total group of 149 minus the
particular subgrouping.

² Except for L and F Scales which are reported in raw
score form since Hathaway and Meehl (11) state the
original T-score tables for these were inaccurate.

3. Results Concerned with Rechoosing Majors.
As noted above, it was possible, when students
were asked about rechoice, to separate each
grouping into 3 categories. In view of smallness
of N’s, no statistical technique was employed
to test for significance but in 5 of 7
instances (all except Anth and Engr), those
students who would select different majors if
given the opportunity to choose again deviated
more from their own major grouping than those
who would rechoose the same major. The
trend is perhaps more pronounced if the total
group of 135 who made a definite rechoice is
considered. Mean T-score deviation from
major grouping is greatest for “different”
(N = 13), M = 7.55; next greatest for “different
but similar” (N = 8), M = 7.28; and
least for “same” (N = 114), M = 6.60. These
results must be considered suggestive rather
than conclusive of the fact that personality
needs exist in academic choice.


Table 2

Significant Differences (t’s) between Major Groupings on the MMPI

L     AM >    PS, 2.67; Engr, 2.78**; Tot, 2.27*
F     Engr <  Anth, 2.58*; Tot, 2.54*
K     MCP <   AM, 2.05*; Geol, 2.19*
D     BA <    MCP, 3.11**; Anth, 2.37*; Tot, 2.99**†
Hy    AM >    Engr, 2.19
Pd    PS >    MCP, 2.30*; Engr, 2.35†; Anth, 2.41*†
Mf    AM >    PS, 3.75**; MCP, 3.43**; Anth, 3.07**; BA, 4.41**;
              Geol, 2.39*; Engr, 4.38**; Tot, 4.71**
      Anth >  Engr, 2.12*
      Engr <  Tot, 2.09*
Pa    PS <    Engr, 3.11*; Anth, 2.58*; BA, 2.15*; AM, 4.84**;
              Tot, 3.34**
Sc    MCP <   PS, 2.30*; AM, 3.07**; Tot, 1.96*
      AM >    Engr, 2.26*
Ma    Engr <  PS, 2.19*; BA, 2.04*; AM, 2.45*; Geol, 2.15*; Tot, 2.71

Tot = Total group minus contrasted subgroup.
* Significant at .05 level.
** Significant at .01 level.
† Corrected for heterogeneity of variance.


Table 3

Strongly Satisfied (SS) Compared with Satisfied-and-Less (LS)

        Prediction from         SS              LS          Prediction
Scale   Table 2               N     M         N     M       Verified?
L       AM > “total”         14    3.4        3    5.7      No
F       Engr < “total”        5    2.2       24    3.6      Yes
D       BA < “total”          8   48.3       15   50.1      Yes
Mf      AM > “total”         14   [illegible]     68.3      Yes
Mf      Engr < “total”        5   56.2       24   59.9      Yes
Pa      PS < “total”          7   [illegible]
Sc      MCP < “total”         4   [illegible]
Ma      Engr < “total”        5   53.8       24   54.7      Yes
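
The deviation measure used for the rechoice categories can be illustrated as follows. This is a minimal sketch assuming the deviation is the absolute difference between a student's T-score and his grouping's mean T-score, averaged over scales; the paper does not spell out the exact computation, and the profiles below are hypothetical. The last lines pool the three reported category means, weighted by their N's.

```python
def mean_deviation(profile, grouping_means):
    """Mean absolute deviation of one student's T-scores from the mean
    T-scores of his major grouping, averaged over the scales given."""
    deviations = [abs(profile[s] - grouping_means[s]) for s in grouping_means]
    return sum(deviations) / len(deviations)


# Hypothetical grouping means and two hypothetical students.
grouping_means = {"D": 54.0, "Mf": 59.0, "Ma": 55.0}
same_rechoice = {"D": 56.0, "Mf": 62.0, "Ma": 51.0}
different_rechoice = {"D": 64.0, "Mf": 50.0, "Ma": 63.0}

dev_same = mean_deviation(same_rechoice, grouping_means)
dev_different = mean_deviation(different_rechoice, grouping_means)

# Category N's and mean deviations as reported in the text.
reported = {
    "different": (13, 7.55),
    "different but similar": (8, 7.28),
    "same": (114, 6.60),
}
total_n = sum(n for n, _ in reported.values())
pooled_mean = sum(n * m for n, m in reported.values()) / total_n
print(dev_same, dev_different, total_n, round(pooled_mean, 2))
```

The three N's sum to the 135 students who made a definite rechoice, which agrees with the figure given in the text.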


Discussion 


Discussion of results will not be concerned
with explaining every significant difference in
Table 2, for when 336 different t’s are run it
is possible that some may reach a level of
significance by chance. Rather, to be conservative,
it would be more pertinent to discuss
trends evidenced in Table 2 where significant
differences were found at least 3 times
between a grouping and other groupings or
between a major grouping and the total.
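
The figure of 336 tests and the caution about chance significance can be made concrete. The decomposition below (7 groupings, 12 scales, each grouping against every other and against “total”) is one reading consistent with the Procedure section, not a breakdown stated by the authors.

```python
from math import comb

n_groupings = 7   # PS, MCP, Engr, Anth, BA, AM, Geol
n_scales = 12     # L, F, K, Hs, D, Hy, Pd, Mf, Pa, Pt, Sc, Ma (? Scale excluded)

pairwise = comb(n_groupings, 2)   # each grouping against every other grouping
versus_total = n_groupings        # each grouping against "total" minus itself
n_tests = (pairwise + versus_total) * n_scales

alpha = 0.05
# Expected number of "significant" results if no true differences existed.
expected_by_chance = n_tests * alpha
print(n_tests, round(expected_by_chance, 1))
```

On this reading, roughly 17 of the 336 comparisons would be expected to reach the .05 level by chance alone, which is why only recurring trends are discussed.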

1. Lie Scale. On this scale, AM is significantly
greater than PS, Engr, and “total.”
Although this scale has not been analyzed
clinically, Hathaway and McKinley (10) state
that high L results when the subject chooses
responses which make him most socially
acceptable. It is of interest here that Grayson
(7) points out that high L is usually accompanied
by high Hy. The AM grouping shows
both highest L and highest Hy, and on both
scales differs significantly from Engr which
has lowest L and Hy. Lying and hysterical
reactions are both socially oriented, implying
sensitivity for applause of the crowd, a need
which is probably greater in artists than
engineers (e.g., artists put their products
on public display). Moreover, the artist 
converts reality into fantasy, and from his 
earliest training precise regard for realism is 
not encouraged. This is quite the opposite of 
Engr’s training. 

2. F Scale. On this scale, Engr is significantly
lower than “total.” Hathaway and
McKinley (10) contend that high F may be 
due to carelessness and low F indicates that 
S’s responses were rational and relatively 
pertinent. Engineering is a precise, careful, 
rational field, completely pertinent to practical 
matters. It is indeed possible that the low F 
of Engr is a carry-over of his training. It is 
of interest here that Cook and Wherry (2) 
found a Mechanical Coordination Factor, 
resulting from a factor analysis of MMPI and 
aptitude tests, had a significant positive load- 
ing on F. They state that subjects with good 
mechanical coordination were meticulous in 
answering personality questions. 

3. D Scale. BA is significantly lower than
MCP, Anth, AM, and “total.” Our results 
are similar to those of Paterson (15), Trabue 
(21), and Dodge (5, 6) who found sales-people 
highly socially dominant. Harrower and Cox 
(9) found insurance salesmen giving evidence 
of considerable drive. As compared to other 
groups, the presence of depressive trends in 
BA would be a definitely serious liability. 

4. Pd Scale. PS scores significantly greater
than MCP, Engr, Anth, and “total.” It
seems paradoxical that PS should have a
higher Pd implying a tendency toward psychopathic
characteristics. Perhaps the best explanation
is rather obvious. In the psychopath,
freedom of expression is paramount; he
defies convention and tradition. Since sociology
and psychology attempt to blame many
of the present taboos for mental illness and 
social problems, it is possible that the PS 
group feels that it must defy convention. Or 
perhaps as a result of training PS does not feel 
any reluctance about answering Pd items 
truthfully or may even interpret them differ- 


ently. These attitudes would be picked up 
by an inventory like the MMPI. 

5. Mf Scale. AM is significantly greater
than all other groupings and “total.” A valid
explanation would be that sensitive or cultural 
work has been characterized as feminine, and 
since art requires both, AM will naturally 
exhibit higher Mf. Roe (17), studying artists, 
states that they have a type adaptation which 
is decidedly non-aggressive, hence more femi- 
nine than masculine. However, she does state 
further that passivity is more characteristic of 
intellectual males; this statement is in line 
with our finding that Mf is the highest of all 
for our total group of 149. On this scale also 
Engr scores significantly lower than “total,” 
a not unexpected result in view of the “mascu- 
linity” of engineering. Cook and Wherry (2) 
found a negative loading on mechanical 
aptitude for those with high Mf. 

6. Pa Scale. PS scores significantly lower
than all other groups (except Geol), and
“total.” It seems that no explanation in
terms of personality is required. The sophis- 
tication of PS enables it to select out the Pa 
questions, especially in view of the fact that 
the paranoid is a dramatic type. 

7. Sc Scale. MCP is significantly lower
than PS, AM, Geol, and “total.” It is probably
true that the reality of MCP represents a
rather unique reality as compared to less pure 
and emotionally rarefied fields. The nature 
of this reality is in no way comparable, how- 
ever, to schizoid reality. The former is 
logical, rational, intellectual and highly ab- 
stract, while the latter is bizarre, irrational, 
emotional and concrete. 

8. Ma Scale. Engr scores significantly
lower than PS, BA, AM, Geol, and “total.” 
Daniels and Hunter (3) state that the work 
needs of the Ma pattern are related to an outlet 
for enthusiasm and a high degree of overt 
activity. We feel that both MCP and Engr 
require the least overt activity in comparison 
to other groups; they do not require as much 
social orientation as PS, e.g., or as strong an 
outlet for enthusiasm as AM, BA, or Geol. 
The engineer, and also MCP, have a definite 
concept of reality and their place in it, and 
therefore, unlike the hypomanic individual,
they do not shift their attention constantly.
However, our results on Engr and the Ma
Scale appear at variance with a finding by
Cook and Wherry (2). They found a factor 
labeled Tendency to Over-Activity with posi- 
tive loadings on mechanical aptitude and 
knowledge, and state that over-active indi- 
viduals find outlet in mechanical pursuits. 
However, the difference between their sub- 
jects, naval enlisted submarine candidates, 
and ours may explain this discrepancy. 

An attempt may now be made to describe 
tentatively each academic group in terms of 
personality characteristics or of personality
demands of the major. 

1. PS. This group creates somewhat of a 
problem since their knowledge of psycho- 
logical dynamics may have contaminated 
their profiles. The MMPI indicates that, as 
a group, PS may be characterized by fairly 
strong Pd tendencies and corresponding Ma 
behavior. Roughly, their orientation is out- 
going with expressed disregard for convention. 

2. MCP. This group indicates a pattern 
of tendencies away from Pd and Ma, with 
somewhat high D and very low Sc. Justification
for this pattern may be found in Roe’s
Rorschach work on scientists. She believes 
that men with dominantly non-verbal ability, 
who seem to find satisfaction in empathy with 
symbols of things rather than things them- 
selves, are men who are suited for and at- 
tracted to science (19). Therefore, for MCP, 
the conclusion might be suggested that very 
weak Pd, Ma, and Sc tendencies are necessary 
for success since these sciences are concerned 
with symbols, ideas, and abstraction, rather 
than with people. The exacting nature of 
these sciences may contribute to the presence 
of somewhat higher D, a finding similar to 
that of Roe (20), who reports physicists with a 
high degree of anxiety because of the nature
of their work. 

3. Engr. Tendencies similar to MCP are 
found in Engr. This group displays very low 
Pd, Ma, and Hy. Their Sc is a little higher 
than MCP, but lower than the other groups. 
Explanations similar to those given for MCP 
appear to apply to engineers, except that they
are the most “masculine” (lowest Mf), a finding
to be expected in view of their mechanical
interests. The personality demands of this
group seem to imply a stability of behavior,
low overt activity, reduced interest in people,
and masculinity. 

4. Anth. This group does not indicate 
strong patterns. Anth has a relatively high 
Mf score which could mean that this field re- 
quires a high degree of sensitivity, a finding in 
agreement with that of Roe (18) in her study 
of vertebrate paleontologists. 

5. BA. As a group, BA may be charac- 
terized by low D, accompanied by Ma tendency 
and social aggression, a factor resulting from 
the comparatively low Mf score. It is obvious 
that the business world demands the opposite 
of what McKinley and Hathaway (10) say 
about high D, namely, “poor morale . . . feel- 
ing of uselessness . . . inability to assume 
normal optimism.” Business also demands 
aggressive tendencies which should show up 
in lower Mf. 

6. AM. This group is represented by a
complexity of characteristics. Extremely 
strong feminine sensitivity (high Mf) is ac- 
companied by high Hy, Sc, Pd, Ma, and 
somewhat Pa tendencies. The generally high 
mean scores of AM suggest that, as a group, it 
is somewhat more maladjusted than the others, 
and may find outlets through artistic sub- 
limations or compensations as suggested by 
Prados (16). In view of the complex pattern 
of AM, it is difficult to state any personality 
demands for it, except the high femininity. 

Because of the small N in Geol, no attempt
will be made to define personality characteristics
or demands. However, these few cases
are not similar to MCP, but we cannot conclude
whether or not this is an artifact of
our sampling.

This study also noted the fact that SS students
were significantly higher in Mf than LS.
Since it has been demonstrated before (4) that
college males have Mf scores higher than
average males, this result is not surprising.
Roe (17) states that feministic adaptation is
characteristic of the sensitive intelligent male,
and that intellectual pursuits become a refuge
for men who deviate from the pattern of masculinity.
College work would thus be more
satisfying to the somewhat more feminine
male.

We have also noted that LS scores significantly
higher than SS on Pt. Hathaway
and McKinley (10) state that a Pt tendency
may be manifested via mild depression, excessive
worry, lack of confidence, and inability
to concentrate. These characteristics
should certainly be less prominent
among SS students than among LS.

Since ours is an exploratory study, with
small N’s, we cannot suggest that patterns
found here be used for guidance purposes.
Research on a much larger scale is needed in
order to confirm and expand these findings.


Summary 


Seven groupings of students majoring in
different academic subjects were contrasted
with each other and with a “total” grouping
minus the particular subgrouping on the
MMPI. They also rated their satisfaction
with their major subject and mentioned their
choice of major if allowed to rechoose. Principal
findings were:

1. The MMPI is valid for distinguishing 
personality trends amongst various major 
groupings. Certain scales significantly dis- 
criminated major groupings from the re- 
mainder of the students. 

2. There is a tendency for students who are 
strongly satisfied with their major to resemble 
their own groupings on discriminative scales. 

3. Significant differences were found between
strongly satisfied and satisfied-and-less
students on Mf and Pt. The former were
higher on Mf, the latter on Pt.

4. When mean T-score deviations from
average T-scores are calculated, there is a
tendency for students who would rechoose the
same major to deviate less from their own
groupings than those who would rechoose a
different major.


Received December 24, 1951.


References 
1. Chyatte, C. Personality traits of professional
actors. Occupations, 1949, 27, 245-250.

2. Cook, E. B., and Wherry, R. J. A factor analysis
of MMPI and aptitude test data. J. appl.
Psychol., 1950, 34, 260-266.

3. Daniels, E. E., and Hunter, W. A. MMPI personality
patterns for various occupations. J.
appl. Psychol., 1949, 33, 559-565.

4. DeCillis, Olga E., and Orbison, W. D. A comparison
of the Terman-Miles M-F Test and the Mf
Scale of the MMPI. J. appl. Psychol., 1950,
34, 338-342.

5. Dodge, A. F. Social dominance and sales personality.
J. appl. Psychol., 1938, 22, 132-139.

6. Dodge, A. F. What are the personality traits of
the successful salesperson? J. appl. Psychol.,
1938, 22, 229-238.

7. Grayson, H. M. The Minnesota Multiphasic Scales.
Mimeographed, no date.

8. Harmon, L. R., and Wiener, D. N. Use of the
MMPI in vocational advisement. J. appl. Psychol.,
1945, 29, 132-141.

9. Harrower, G. J., and Cox, K. J. The results obtained
from a number of occupational groupings
on the professional level with the Rorschach
group method. Bull. Canad. Psychol. Assn.,
1942, 2, 31-33.

10. Hathaway, S. R., and McKinley, J. C. Manual
for the Minnesota Multiphasic Personality Inventory.
N.Y.: Psychol. Corp., 1943.

11. Hathaway, S. R., and Meehl, P. E. An atlas for
the clinical use of the MMPI. Minneapolis:
U. Minn. Press, 1951.

12. Lewis, J. A. Kuder Preference Record and MMPI
scores for two occupational groups. J. consult.
Psychol., 1947, 11, 194-201.

13. Lough, Orpha M. Teachers college students and
the MMPI. J. appl. Psychol., 1946, 30, 241-247.

14. Lough, Orpha M. Women students in liberal arts,
nursing, and teacher training curricula and the
MMPI. J. appl. Psychol., 1947, 31, 437-445.

15. Paterson, D. Research studies in individual
diagnoses. Univ. Minn. Bull. Emplyt. Stab. Res.
Inst., 1934, 3, No. 4.

16. Prados, M. Rorschach studies on artists. Rorschach
Res. Exch., 1944, 8, 178-183.

17. Roe, Anne. The personality of artists. Educ.
psychol. Measmt., 1946, 6, 401-408.

18. Roe, Anne. A Rorschach study of a group of
scientists and technicians. J. consult. Psychol.,
1946, 10, 317-327.

19. Roe, Anne. Personality and vocation. Trans.
N. Y. Acad. Sci., 1947, Series II, 9, 257-267.

20. Roe, Anne. Analysis of group Rorschachs of physical
scientists. J. proj. Tech., 1950, 14, 385-398.

21. Trabue, M. R. Occupation ability patterns. Personnel
J., 1933, 11, 344-351.

22. Verniaud, Willie M. Occupational differences in
the MMPI. J. appl. Psychol., 1946, 30, 604-613.

23. Wiener, D. N., and Simon, S. Personality characteristics
of embalmer trainees. J. appl. Psychol.,
1950, 34, 391-393.


The Effects of Tachistoscopic Training in an Adult Reading Program¹


George Manolakes 


Division of Research, State Education Department, University of the State of New York 


The role of the tachistoscope in a reading 
improvement program has been the subject of 
much discussion. Since the tachistoscope has 
been one of the training devices used in the 
Reading Improvement course at the Marine 
Corps Supply Schools, this study was initiated 
in order to determine the effects of omitting ta- 
chistoscopic training from the course of in- 
struction. 


Design of Experiment 


Accordingly, a study was designed which 
would provide for an Experimental and a 
Control group to undergo training in reading 
improvement simultaneously. The variable 
element within their instruction was the ex- 
clusion of tachistoscopic training from the pro- 
gram of the Experimental group, and the ex- 
tension of instruction in vocabulary and 
comprehension skills. 

The effects of the variable courses of in- 
struction were measured in terms of the 
changes that would normally be expected from 
training with tachistoscopic devices. The 
Ophthalmograph was selected as the means 
by which certain of the visual skills might be 
measured quantitatively. Each student was 
tested with the Ophthalmograph prior to in- 
struction and at the conclusion of the training 
period. The film records of the eye move- 
ments were utilized in determining: (a) the 
mean number of fixations; (b) the mean num- 
ber of regressions; (c) the mean span of recog- 
nition; and (d) the mean duration of fixations. 

In addition to the changes in visual skills,
changes in reading rate have also been attributed
to tachistoscopic training. As an
index of reading rate, the Silent Reading Check
RRRP G8-5 was administered prior to instruction
and the Silent Reading Check RRRP
G8-6 was used at the conclusion of the program.
The data were limited to reading rate
since it was felt that this would be the more
appropriate index, tachistoscopes being more
closely related to the perceptual skills than
to those in comprehension.


¹ The author wishes to express his gratitude to Colonel
E. W. Shelberne, Director, and Lt. Col. E. B. Watson,
Director of Instruction, Marine Corps Supply Schools,
whose interest and efforts made this project possible.
In conformance with existing Naval Regulations, the
opinions or assertions contained in this article are the
private ones of the writer and are not to be construed
as official or reflecting views of the Navy Department
or naval service at large.

Population. The population to be studied
was a group of Marine Corps officers under
instruction at the Marine Corps Supply Schools.
The group of 34 officers included 1 captain, 8
first lieutenants, 24 second lieutenants, and 1
commissioned warrant officer.

The officers were divided into an Experimental
and a Control group on the basis of:
(a) age, (b) raw scores on the Otis Self-Administering
Tests of Mental Ability, Higher
Examination: Form B for High Schools and
Colleges; (c) initial reading rate as determined
from Silent Reading Check RRRP G8-5; (d)
comprehension scores on above test; and (e)
the mean educational level in terms of years of
school completed. Table 1 shows the mean
scores for each group.


Table 1

Comparison of Experimental and Control Groups

                              Experimental   Control
                              Group          Group
Variables                     Mean           Mean
Age                           25.1           25.0
Otis Test                     56.2           56.5
Rate (words per minute)       351.8          352.4
Comprehension (per cent)      84.1           83.5
Education in Years            14.7           14.9

In addition to these factors, others such as 
the number of regular and reserve officers, 
years of service, active and inactive, were 
also considered and equated. 


Program of Instruction 


Total Group. The group received an hour
of orientation which included a statement of
objectives and procedures as well as familiarization
with the various routine functions.
There were two hours devoted to testing previously
described; i.e., visual, reading, and
intelligence. At the conclusion of the course,
an hour of critique was held for a final summary
of progress.

Reading Rate Controller. This device was 
used by both the Experimental and Control 
groups. Each group spent eighteen 25-minute 
training sessions on the Reading Rate Con- 
troller. 

Control Group. This group followed the 
normal course of instruction which included 
the Reading Rate Controller training described
above. In addition, there were eighteen
12½-minute training sessions with the
tachistoscope. This training provided practice
at 5, 6, 7, 8, and 9 digits at 1/25 and
1/100 of a second. Progression to the next
higher series of digits was controlled by successful
completion at each speed setting. Successful
completion was interpreted to be 23
correct from a group of 25 slides. The limits
for competency were those adopted during
the research phase of the reading program
sponsored by the U.S. Navy Field Medical
Research Laboratory. In addition to the instruction
and practice on the devices, there
were nine 12½-minute periods devoted to the
development of vocabulary and an equal 
number for the development of comprehen- 
sion skills. During the vocabulary periods, 
there was instruction presented in independent 
word attack skills and dictionary skills. 
Vocabulary checks were administered at each 
session which served to introduce new skills as 
well as to provide practice. In the alternate 
sessions, emphasis was placed upon the de- 
velopment of comprehension skills. The stu- 
dent was provided the opportunity to practice 
in a free reading situation those skills being 
forced by the controlled reading training 
periods. Comprehension exercises were pro- 
vided and served as the check valve for the 
tendency to overemphasize speed in reading. 

Experimental Group. The training received 
by this group differed in that no tachistoscope 
was provided. The time normally allocated 
to this training was included in a broader
program of training in vocabulary and compre- 
hension skills. 


Results 


The results presented are those dealing with
the differences that existed between the Experimental
and Control groups in the area previously
discussed. The results, with the exception
of reading rate, are based upon
the quantitative measurements made from the
Ophthalmograph film records.


Table 2

Summary of Mean Reading Rates in Words per Minute

                Initial   Final    Diff.    s(x̄₁−x̄₂)   t
Experimental    351.8     809.8    458.0    40.5        11.314*
Control         352.4     571.8    219.4    34.1         6.435*
Diff.             0.6     238.0
s(x̄₁−x̄₂)         5.8      56.9
t                 0.1       4.184*

* Significant at the 1% level.


Reading Rate. The changes that occurred 
during the training period are shown in Table 2. 

The data fail to indicate any significant 
difference between the groups prior to in- 
struction. At the end of the training period, 
a statistically significant difference between 
the groups does appear. It will also be noted 
that both groups made significant gains in 
reading rate as is indicated by their initial and 
final reading levels. In considering the total 
group, the Initial Reading Rate was 352.1 and 
the Final Reading Rate was 690.8. The 
difference of 338.7 and the standard error of 
33.3 resulted in a “t” of 10.165 which is significant
at the 1 per cent level.
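
The t values reported for reading rate can be checked against the printed differences and standard errors. The sketch below assumes each t is simply the difference between means divided by the standard error of that difference; small discrepancies are to be expected because the printed figures are rounded. The Control difference is taken as 571.8 − 352.4 = 219.4.

```python
# (difference between means, standard error of the difference, reported t)
reading_rate_rows = {
    "Experimental, initial vs. final": (458.0, 40.5, 11.314),
    "Control, initial vs. final": (219.4, 34.1, 6.435),
    "Between groups, final": (238.0, 56.9, 4.184),
    "Total group, initial vs. final": (338.7, 33.3, 10.165),
}


def t_ratio(difference, std_error):
    """t as the ratio of a mean difference to its standard error."""
    return difference / std_error


recomputed = {
    label: t_ratio(diff, se) for label, (diff, se, _) in reading_rate_rows.items()
}
for label, (_, _, reported_t) in reading_rate_rows.items():
    print(f"{label}: recomputed {recomputed[label]:.3f}, reported {reported_t}")
```

Each recomputed ratio agrees with the reported value to within about 0.01.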

Number of Fixations. In Table 3 there is a 
comparison made of the mean number of 
fixations for each group during the reading of 
a 100-word passage. 

There are no significant differences between
the groups indicated either prior to or following
the training program. There are statistically
significant differences found within
each group between the initial and final means.
As a total group, the mean number of fixations
was reduced from 76.9 to 53.5. The “t” value
of 7.855 is significant at the 1 per cent level.


Table 3

Summary of Mean Number of Fixations for 100-Word Passage

                Initial   Final    Diff.    s(x̄₁−x̄₂)   t
Experimental    77.5      52.1     25.4     3.6         6.987*
Control         76.4      54.8     21.5     4.8         4.476*
Diff.            1.2       2.7
s(x̄₁−x̄₂)        4.9       3.7
t                0.238     0.740

* Significant at the 1% level.


Table 4

Summary of Mean Number of Regressions for 100-Word Passage

                Initial   Final    Diff.    s(x̄₁−x̄₂)   t
Experimental    11.8      5.2      6.6      1.9         3.439*
Control          8.3      4.1      4.2      2.8         1.515
Diff.            3.4      1.1
s(x̄₁−x̄₂)        3.1      1.5
t                1.116    0.722

* Significant at 1% level.

Number of Regressions. Since the reduction 
of the number of regressions would be another 
expected outcome of visual training, the mean 
number of regressions was computed for each 
group from Ophthalmograph records. These 
are shown in Table 4. 

There are no indications of significant differ- 
ences with the exception of those found within 
the Experimental group. As a total group, 
the mean was reduced from 10.1 to 4.7. The 
“t” value for the difference was 3.220 which is
significant at the 1 per cent level. 

Average Span of Recognition. This measure 
is directly related to the number of fixations 
since it is computed from them. It is pre- 
sented as additional information in Table 5. 

There are no indications of significant differ- 
ences between the groups either prior to or 
following instruction. There are significant 
differences noted within each group. As a 
total group, the mean was increased from 1.37 
words to 1.97 words. The “t” value of 9.071
is significant at the 1 per cent level. 


Table 5

Summary of Mean Spans of Recognition in Number of Words

                Initial   Final    Diff.    s(x̄₁−x̄₂)   t
Experimental    1.37      2.04     0.68     0.09        7.424*
Control         1.37      1.90     0.53     0.10        5.503*
Diff.           0.01      0.14
s(x̄₁−x̄₂)       0.08      0.13
t               0.069     1.085

* Significant at the 1% level.


Average Duration of Fixations. This measure
is derived from the number of fixations and
the reading time for the 100-word passage.
The data are shown in Table 6.


Table 6

Summary of Average Duration of Fixations in Seconds

                Initial   Final    Diff.    s(x̄₁−x̄₂)   t
Experimental    0.20      0.19     0.01     0.01        0.737
Control         0.19      0.20     0.01     0.01        1.594
Diff.           0.01      0.01
s(x̄₁−x̄₂)       0.01      0.01
t               1.539     0.641

The data present no basis for assuming any 
differences between the groups or within each 
group as a result of training. The total group 
initially had a mean of 0.19 second and a final
mean of 0.20. The difference was not significant,
but the means agree with generally accepted values
for the average duration of fixations.


Summary 


In comparing the results of the Experimental
and Control groups, there were no significant
differences between the groups in the reduction
of the number of fixations, the increase of the
span of recognition, the reduction of regressive
movements, or reduction of the duration of
fixations. There was a significant difference
in reading rate at the conclusion of the training
program, but this was in favor of the
Experimental group. Significant improvements
were observed within the groups upon comparing
the initial and final performances in the
area of number of fixations, span of recognition,
and reading rate.

1. In the adult population studied, the
results failed to indicate that the Experimental
group was penalized through a lack of
tachistoscopic training.

2. Although the areas measured were those
in which tachistoscopic training is expected
to produce positive results, the data failed
to provide any basis for rejecting the null
hypothesis.

3. Improvement of the visual skills measured
by the Ophthalmograph may be the result
of the improvement in reading efficiency rather
than contributing to this improvement in
reading.


Received January 25, 1952. 


The Effect of Fluorescent Flicker on Visual Efficiency


Attilio Zaccaria, Jr., and M. E. Bitterman


University of Texas


Although fluorescent lighting has come into 
increasingly widespread use, there is still con- 
siderable doubt about its suitability. Reports 
of strain and discomfort from workers and 
students required to perform visual tasks 
under fluorescent lighting are numerous (6, 12), 
but the source of these difficulties has yet to 
be established. As Luckiesh (8) noted, it is 
necessary to distinguish between the quality 
of light and the quality of lighting. Luckiesh 
himself believes that complaints about fluo- 
rescent lighting may be traced to inadequate 
control of glare, an opinion in which Hardy (4) 
concurs, and the well-known resistance of the 
public to technological change may play some 
role. The possibility remains, however, that 
intrinsic properties of the new source, particularly
spectral characteristics and flicker,
may be responsible in part for the reported 
difficulties. Questions relating to ultraviolet 
output, lag of emission in the yellow-green 
regions, and the significance of spectral lines 
have been widely discussed, but no generally 
acceptable conclusions have been reached (12,
10, 7, 5). Flicker is generally recognized to be
undesirable, but many writers have mini- 
mized the importance of this factor in 60-cycle 
installations (12, 2). 

There has been surprisingly little direct 
study of visual functioning under fluorescent 
illumination. Luckiesh and Moss (9) found 
no significant difference in frequency of blink- 
ing during work under tungsten-filament and 
fluorescent daylight lamps and concluded that 
ease-of-seeing was the same under the two 
conditions. Unfortunately, however, the unreliability
of the index employed and its lack
of relation to effort-expenditure in visual work
cast doubt upon the validity of this conclusion
(20). Preference tests conducted by
Holway and Jameson (6) gave a clear advantage
to tungsten as compared with a variety
of differently colored fluorescent sources, but
preference tests cannot be taken at face value.
Dark-adaptation was found to be retarded by
work under fluorescent as compared with
incandescent light, but the difference
disappeared when indirect lighting was used.
In a study with the American Optical
Company’s Sight-Screener, Gray and Prevetta (3)
recorded changes in a variety of visual
functions following two hours of reading under
equally bright (20 foot-candle) fluorescent
lighting and daylight.
No significant difference between the two con- 
ditions was found, but the possibility remains 
that differences might have been revealed by 
other measures. 

If research on visual functioning under fluo-
rescent light has not been abundant, neither 
has it been sufficiently analytical; no attempt 
has been made to separate the two principal 
factors—spectral characteristics and flicker— 
which differentiate fluorescent from other 
sources. In the experiment here reported, 
which was designed to study the effects of 
flicker, that factor was isolated by comparing 
A.C.- and D.C.-operated fluorescent lamps.¹
Furthermore, there has been no attempt in 
previous investigations to insure that the 
amount of visual work done by the subjects 
under the conditions studied was the same, a 
difficulty which was remedied in the present 
study by the use of a visual task that provided 
an objective measure of performance. Finally, 
insufficient attention has been given to the 
selection of appropriate indices of fatigue and 
efficiency. This problem is the most crucial 
of all in contemporary research on illumination 
(1); performance measures are inadequate in- 
dices of fatigue and efficiency, and techniques 
for the evaluation of effort are still in a develop- 
mental stage (13, 19). In the present study, 
primary concern with the effects of fluo- 
rescent lamp flicker led to the use of CFF 
(critical fusion frequency) as an index of 
fatigue. There is some evidence that CFF is 
reduced by exposure to flickering light (18, 
11), and a variety of studies by Simonson and 


1 In order to study the effects of spectral quality
independently of flicker, all sources should be
D.C.-operated.




Table 1
Design of Experiment


          Test Form              Illumination
Group     Period 1   Period 2    Period 1   Period 2
1         I          II          A.C.       D.C.
2         I          II          D.C.       A.C.
3         II         I           A.C.       D.C.
4         II         I           D.C.       A.C.


his colleagues (14, 15, 16, 17) suggest that a 
decline in CFF may be employed as a valid 
index of impairment. 


Apparatus and Procedure 


The visual tasks selected for the experiments 
were Forms I and II of Tinker’s Speed of Read-
ing Test.² This test consists of a series of
numbered sentences, each of which contains a 
word which spoils the meaning of the sentence, 
and the task of the S is to locate and cross out 
these words. The sentences are simple enough 
to minimize the comprehension factor when the 
test is used with college students. The 
number of items correctly marked in a given 
period provides an objective measure of 
achievement. 

Twenty students at the University of Texas, 
12 males and 8 females, served in the experi- 
ment. Their vision was normal or corrected 
to normal. Each S took both forms of the 
test, 30 minutes being allowed for each form. 
Half the Ss began with Form I and half with 
Form II. Each S worked under both con- 
ditions of illumination, half beginning with 
A.C. and half with D.C. The procedure, 
therefore, required four subgroups, as shown 
in Table 1. 
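The counterbalancing just described can be sketched as the crossing of two orders of test form with two orders of illumination. This is only an illustration: the group numbering it produces and the rule assigning five subjects to each subgroup are assumptions, since the text does not say how individual Ss were allotted.

```python
from itertools import product

# Two possible orders of the test forms and of the illumination conditions.
form_orders = [("I", "II"), ("II", "I")]
illumination_orders = [("A.C.", "D.C."), ("D.C.", "A.C.")]

# Crossing the two factors gives the four subgroups of the design.
subgroups = list(product(form_orders, illumination_orders))

# Hypothetical allotment: subjects 1-20 are dealt to the subgroups in
# rotation, giving five subjects per subgroup.
subjects = range(1, 21)
assignment = {s: (s - 1) % len(subgroups) + 1 for s in subjects}

for number, (forms, lights) in enumerate(subgroups, start=1):
    members = [s for s, g in assignment.items() if g == number]
    print(number, forms, lights, members)
```

With this arrangement every form and every illumination condition appears equally often in each period, so that practice and order effects are balanced across conditions.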

The Ss were studied individually in a 
specially constructed, semicircular enclosure, 
6 feet high and 41 inches wide, which contained 
a writing surface. The entire surround was 
painted flat white. Three 15-watt General 
Electric daylight fluorescent lamps were 
mounted on the ceiling of the booth, well out 
of S’s visual field. The power-supply made 
it possible to operate the lamps, which were 


2 We are indebted to Dr. Miles A. Tinker for providing 
these materials. 




wired in series, on either A.C. or D.C. The 
transition from alternating to direct current,
and the reverse, was accomplished by the
operation of a double-pole-single-throw switch.
The level of illumination on the working sur-
face was 20 foot-candles under each condition.
A circular opening, 5/8 inch in diameter, was
cut in the surround at eye-level. The opening, 
covered with a milk-glass plate, was 15 inches 
away from S, and behind it a General Radio 
Strobotac, Type 613-B, was set. With this 
apparatus the CFF determinations were made. 

S was uninstructed as to the purpose of the 
experiment. He was told that his speed of 
reading was being measured and he was 
asked to work as rapidly and as accurately as 
possible. For each S, fifteen CFF determina-
tions (ascending order) were made in all: five
before the reading began, five at the con-
clusion of the first reading period, and five
at the end of the second period.
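One plausible way of turning the three sets of five determinations into a CFF drop for each period is sketched below. The rule used here (mean of the set taken before a period minus the mean of the set taken after it) is an assumption, as the procedure for computing the drop is not spelled out, and the readings are hypothetical values in cycles per second.

```python
from statistics import mean


def cff_drop(before, after):
    """Drop in CFF across a work period: mean before minus mean after."""
    return mean(before) - mean(after)


# Hypothetical readings for one S, five determinations per set.
baseline = [40.0, 40.5, 39.5, 40.0, 40.0]
after_period_1 = [39.0, 38.5, 39.0, 38.5, 39.0]
after_period_2 = [38.5, 38.5, 38.5, 38.5, 39.0]

# Assumed rule: each period is referred to the set taken just before it.
drop_period_1 = cff_drop(baseline, after_period_1)
drop_period_2 = cff_drop(after_period_1, after_period_2)

print(round(drop_period_1, 2), round(drop_period_2, 2))
```

Under this rule the set taken at the end of the first period serves both as the final measure for that period and as the starting measure for the second.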


Results and Discussion 


Table 2 shows the mean performance scores
and CFF drops under the two conditions of
illumination. The difference in performance
was not statistically significant (t = .7, P >
.05). The drop in CFF under the D.C. con-
dition was not significant (t = .75, P > .05),
but the A.C. drop was highly significant
(t = 5.0, P < .01). The difference between
the two conditions in terms of CFF drop also
was statistically significant (t = 2.9, P < .01).
Changes in CFF for Groups 1 and 3 and Groups
2 and 4 combined are plotted in Fig. 1.
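The t values quoted above are ratios of a mean difference to its standard error, and the arithmetic can be checked in a few lines. The sketch assumes that the Table 2 entries read as CFF drops of 1.25 (A.C.) and .51 (D.C.) with standard errors of .25 and .68, and a difference of .74 with standard error .25; the function name is illustrative.

```python
def t_ratio(difference, standard_error):
    """t as the ratio of a mean difference to its standard error."""
    return difference / standard_error


# (difference, standard error) pairs as read from Table 2 (assumed reading).
rows = {
    "A.C. drop": (1.25, 0.25),
    "D.C. drop": (0.51, 0.68),
    "A.C. minus D.C.": (0.74, 0.25),
}

ratios = {label: t_ratio(d, se) for label, (d, se) in rows.items()}
for label, t in ratios.items():
    print(f"{label}: t = {t:.2f}")
```

On this reading the ratios agree with the values of 5.0, .75, and 2.9 given in the text.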

A more thorough examination of the re-
lationships revealed in the data was made in
an analysis of variance. The principal purpose
of this analysis was to test the significance
of the order effect suggested in the curves
of Fig. 1, but this effect did not prove to be
significant (F = 1.1, P > .05). Analysis of
variance confirmed the conclusions based on
t-tests. CFF drop was significantly influenced
by condition of illumination, but speed of
reading was not.


Table 2


Speed of Reading and CFF Drop as a
Function of Illumination


                 A.C.     D.C.     Dif.    S.E. Dif.   N     t
Reading Score    272.8    270.4    2.4     3.4         20    .71
CFF Drop         1.25                      .25         20    5.00
                          .51              .68         20    .75
                                   .74     .25         20    2.96


Fig. 1. CFF drop as a function of illumination.


At the conclusion of the experimental ses- 
sion, each S was asked whether he had noticed 
a difference in the illumination employed in 
the two periods of work. Of the 20 Ss, only 
five noted a difference, but all five, without
exception, reported a preference for the D.C.
condition, which came first for two of them
and second for the remaining three. These 
reports, therefore, tend to confirm the ob- 
jective findings of the experiment. 

These results suggest the undesirability of
single-lamp A.C.-operated fluorescent fixtures,
which are widely used in the home, if not in
commercial installations. The present
experiment permits no conclusion with respect
to the adequacy of two-lamp, out-of-phase
installations such as are commonly used in
factories and schools, but these fixtures could
be studied by the technique here employed.
This experiment provides no basis for
evaluating the spectral properties of fluorescent
sources, although again it suggests the kind
of work which might be done. In order to
control the flicker factor, D.C.-operated
fluorescent lamps should be used for comparisons
with other sources.


Summary and Conclusions 


The experiment here reported was designed 
to study the effects of fluorescent lamp flicker 
upon visual fatigue. Performance of a stand- 
ardized visual task was measured for two 30- 
minute periods under 20 foot-candles of fluo- 
rescent daylight illumination. During one of 
the periods the lamps were operated with 
direct current, while during the other period 
they were operated with alternating current. 
In this way spectral characteristics, brightness 
level, and distribution were constant, with 
flicker being the only variable. Performance 
did not differ significantly under the two 
conditions, but the A.C. condition produced a 
significantly greater drop in critical fusion 
frequency than did the D.C. Only 25 per 
cent of the subjects detected a difference be- 
tween the two conditions of illumination, but 
those that did uniformly expressed a prefer- 
ence for the D.C. condition. 

It may be concluded that single-lamp, or
in-phase, multiple-lamp fluorescent installa-
tions are undesirable. Further experimentation
is needed for the evaluation of out-of-phase
installations.

Received December 24, 1951.


References 


1. Bitterman, M. E. Lighting and visual efficiency:
The present status of research. Ill. Engng.,
1948, 43, 906-922.

2. Council on Industrial Health. The effect of
fluorescent lighting on vision. J. Amer. Med.
Assoc., 1945, 128, 1229.

3. Gray, J. S., and Prevetta, P. Fluorescent light
versus daylight. J. appl. Psychol., 1950, 34,
235-236.

4. Hardy, L. H. Discussion. Ill. Engng., 1945, 40,
289.

5. Harmon, D. B. Lighting and the eye. Ill. Engng.,
1944, 39, 481-500.

6. Holway, A. H., and Jameson, Dorothea. Good
lighting for people at work in reading rooms and
offices. Graduate School of Business Administration,
Harvard University, 1947, 1-43.

7. Ley, E. B. Study of illumination. Ill. Engng.,
1944, 39, 501-505.

8. Luckiesh, M. Discussion. Ill. Engng., 1945, 40,
285.

9. Luckiesh, M., and Moss, F. K. Vision and seeing
under light from fluorescent lamps. Ill. Engng.,
1942, 37, 81-88.

10. Luckiesh, M., and Taylor, A. H. Radiant energy
from fluorescent lamps. Ill. Engng., 1945, 40,
77-88.

11. McFarland, R. A., Holway, A. H., and Hurvich,
L. M. Studies in visual fatigue. Graduate
School of Business Administration, Harvard
University, 1942, 191-219.

12. Morgan, L. D. There is something wrong with our
fluorescent lighting applications. Ill. Engng.,
1945, 40, 275-285.

13. Ryan, T. A., Cottrell, C. L., and Bitterman, M. E.
Muscular tension as an index of effort: The effect
of glare and other disturbances in visual work.
Amer. J. Psychol., 1950, 63, 317-341.

14. Simonson, E., and Brozek, J. A note on
methodological evaluation of selected visual tests.
Amer. J. Ophthal., 1948, 31, 979-984.

15. Simonson, E., and Brozek, J. Effects of illumination
level on visual performance and fatigue.
J. Opt. Soc. Amer., 1948, 38, 384-397.

16. Simonson, E., and Brozek, J. The effect of spectral
quality of light on visual performance and fatigue.
J. Opt. Soc. Amer., 1948, 38, 839-840.

17. Simonson, E., and Enzer, N. Measurement of
fusion frequency of flicker as a test of fatigue of
the central nervous system. J. Indus. Hyg. &
Toxicol., 1941, 23, 83-89.

18. Snell, P. A. An introduction to the study of visual
fatigue. J. Soc. Motion Pic. Eng., 1933, 20,
367-390.

19. Travis, R. C., Kennedy, J. L., Mead, L. C., and
Allphin, W. Muscle action potentials as a measure
of visual performance cost. Ill. Engng.,
1951, 46, 182-187.

20. Wood, C. L., and Bitterman, M. E. Blinking as
a measure of effort in visual work. Amer. J.
Psychol., 1950, 63, 584-588.


Visual Tracking: II. Effects of Brightness and Width of Target¹


Robert S. Lincoln and Karl U. Smith


University of Wisconsin


The problem of the role of sensory cues in 
defining accuracy in visual tracking has both 
theoretical and applied significance. Ele- 
mentary notions suggest that precision in 
tracking would correspond directly to the 
perceptual cue refinement resulting from size, 
brightness and velocity variations in the 
sensory situation. But visual tracking is not 
a simple, direct response to sensory cues.
Rather, it is a response pattern composed of 
simultaneous adjustive reactions of position, 
rate, and acceleration control. Furthermore, 
the human operator in the tracking task acts 
as an oscillating generator, whose accuracy 
of response is only partly defined by the 
perceptual aspects of the tracking situation. 
The tracker’s precision, therefore, may not be 
directly related to the magnitudes of stimula- 
tion produced by the sensory characteristics 
of the target and of the cursor.

In regard to these special considerations 
about visual tracking, an attempt has been 
made to determine in fact how tracking ac- 
curacy varies with brightness of target area, 
width of target, and with changes in the pattern 
of the target and cursor. 


Experimental Methods 


1. Design. In this report we describe a
series of experiments, all dealing primarily with
the effect of changes in target width upon the
accuracy of tracking. Three experiments have
been performed in order to determine the
nature of this target-width effect relative to
variation in target brightness and target-cursor
pattern.

The first experiment utilized a design in
which four subject groups were trained and
tested on target-cursor patterns combined
with targets of different width. The
subject groups were all tested over the same
range of variation in target brightness.

In the second experiment, the effect of varia-
tion in target width upon the accuracy of
tracking was studied with a target and cursor
of fixed pattern and fixed brightness. This
experiment was conducted in order to deter-
mine the character of target-width effects with
a greater variation in the width of the target.
The study involved a Latin square design in
which all subjects were tested under all experi-
mental conditions.


1 This research was supported by funds provided by
the Legislature of the State of Wisconsin and assigned
by the Research Committee of the Graduate School,
University of Wisconsin.


The third experiment dealt with target-width 
effects in tracking when cursors of different 
colors were used in combination with a number 
of target widths. This experiment also utilized 
a Latin square type of design in which all 
subjects were tested under all experimental 
conditions. 

The specific conditions of these experiments 
and other pertinent details connected with the 
statistical aspects of design will be mentioned 
in connection with the presentation of the 
results. 

2. Apparatus. The detailed features of the 
apparatus used in these experiments have been
described previously.² The subject’s task is
to track a small target rotating at varying 
speeds. The target moves in both a clockwise 
and counterclockwise direction through a limit- 
ing arc of some 300 degrees. Its radius of 
rotation is approximately nine inches. A cam 
arrangement regulates the direction and veloc- 
ity of target movement. The subject controls 
a handwheel, 18.4 cm. in diameter, located 
approximately 130 cm. from the target plane. 
This handwheel in turn controls a cursor of the 
same radial length as the radius of rotation of 
the target. Cursor and target approximate 
the same plane of movement in order that 
parallax effects may be eliminated.

Target-cursor pattern is varied by sliding the 
cursor up or down so that either an overlapping 
or vernier pattern can be secured. Variations 
in target width are effected by means of a slide 
arrangement on the target support. 

The error-recording system utilized in the 
present apparatus produces an integrated error 
record that is registered on a clock. Such a 
system weights the subject’s errors according 
to the magnitude of the error and, therefore, 
includes more information in the final trial 
score than would a system which records only 
“on target” time. The error recording adjust- 
ment of the apparatus was kept constant for 
all target widths. Accordingly, the intrinsic 


2 Lincoln, R. S., and Smith, K. U. Transfer of train- 
ing in tracking performance at different target speeds. 
J. Appl. Psychol., 1951, 35, 358-362. 




difficulty of the tracking task remained at a 
fixed level, regardless of target width. 
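The distinction drawn above between an integrated error score and an “on target” time score can be illustrated numerically. The sketch below is not a model of the mechanical recorder; it assumes an error trace sampled at fixed intervals, and the traces, sampling interval, and tolerance are hypothetical.

```python
def integrated_error(errors, dt):
    """Approximate time integral of absolute error (error units x seconds)."""
    return sum(abs(e) for e in errors) * dt


def time_on_target(errors, dt, tolerance):
    """Total time during which the absolute error is within the tolerance."""
    return sum(dt for e in errors if abs(e) <= tolerance)


dt = 0.1          # hypothetical sampling interval in seconds
tolerance = 1.0   # hypothetical "on target" band

# Two hypothetical trackers who are off target equally often,
# but by different amounts.
small_misses = [0.0, 2.0, 0.0, 2.0, 0.0, 2.0]
large_misses = [0.0, 8.0, 0.0, 8.0, 0.0, 8.0]

for name, trace in (("small", small_misses), ("large", large_misses)):
    print(name,
          round(time_on_target(trace, dt, tolerance), 2),
          round(integrated_error(trace, dt), 2))
```

The two traces receive the same time-on-target score but different integrated error scores, which is the additional information the integrated record is said to carry.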

The mechanism of the integrated error re- 
cording system consists of a hollow brass tube 
that is rotated at a constant speed by a motor. 
A triangular piece is cut from the surface of 
the tube and the tube is filled with sealing wax. 
A mechanical indicator makes continuous con- 
tact with the surface of the tube and the inter- 
mittent electrical contact between the indicator 
and the rotating brass plate of the tube regis- 
ters magnitude of error in accordance with the 
deviations of the indicator from a zero position. 
This time of contact between the indicator
and the tube plate is registered on the clock.

Illumination of the target area is provided 
by a slide projector that is mounted on a 
tripod and enclosed except for the emerging 
light beam. Variation in the brightness of the 


target area is produced by a series of Wratten
filters.³


Results 


Experiment 1. Effects of Variation in Target 
Width, Target-Cursor Pattern, and Target 
Brightness. In this experiment two target 
widths were used. The width of one target 
subtended a visual angle of 42 minutes while 
the visual angle subtended by the second 
measured only 4 minutes. The targets were 
60 minutes high. The cursor used was 4 
minutes wide and 60 minutes long. Target- 
cursor pattern was varied in two ways. In 
one pattern the cursor completely overlapped 
the target, while, in the second, the cursor and 
target were arranged in a vernier relationship. 
Both target widths were used with both types 
of cursor, making up four separate target- 
cursor combinations. Target brightness was 
varied over four levels which covered a range 
of three log units. 
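The visual angles given above can be converted to linear dimensions once a viewing distance is fixed. The sketch below assumes a viewing distance of 130 cm, which is the stated distance from the handwheel to the target plane; the position of the subject's eye is not reported, so the resulting widths are illustrative only.

```python
import math


def width_from_visual_angle(arc_minutes, distance_cm):
    """Linear extent subtending a given visual angle at a given distance."""
    theta = math.radians(arc_minutes / 60.0)
    return 2.0 * distance_cm * math.tan(theta / 2.0)


# Assumption: the eye is about as far from the target plane as the handwheel.
VIEWING_DISTANCE_CM = 130.0

for minutes in (4, 42, 60):
    width = width_from_visual_angle(minutes, VIEWING_DISTANCE_CM)
    print(f"{minutes} minutes of arc -> {width:.2f} cm")
```

At angles this small the linear width is very nearly proportional to the visual angle, so the wide target is about ten and a half times the width of the narrow one at any viewing distance.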

Twenty-four students, both male and fe- 
male, served as subjects in this experiment. 
The instructions to the subject were: “Try to 
keep the cursor in the center of the target at 
all times.” These instructions were constant 
in all experiments. 

On the first day of the experiment all sub- 
jects received four two-minute trials. Each 
trial involved a different target-cursor com- 
bination. The orders in which the different 
combinations were presented to the 24 sub- 
jects were assigned at random from the total 
of 24 possible orders. On the second day of 


3 Eastman Kodak Company, Rochester, New York. 




the experiment all subjects again received four 
two-minute trials involving the four target- 
cursor combinations, but in a randomly as- 
signed order which differed from that of the 
previous day. The mean scores obtained on 
these first eight trials served as a basis for 
matching the subjects in four groups of six 
subjects each. 
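The 24 possible orders mentioned above are the permutations of the four target-cursor combinations. The following sketch enumerates them and draws an assignment for two days; it assumes that each subject received a different order on a given day, which the text states explicitly only for the brightness orders, and the seed and function name are arbitrary.

```python
import random
from itertools import permutations

combinations = [
    "wide-overlapping",
    "wide-vernier",
    "narrow-overlapping",
    "narrow-vernier",
]

# All possible presentation orders of the four combinations (4! = 24).
orders = list(permutations(combinations))

rng = random.Random(0)  # arbitrary seed so the sketch is repeatable


def draw_assignment(subjects, orders, rng, avoid=None):
    """Give each subject a different order, optionally unlike a prior day."""
    while True:
        shuffled = list(orders)
        rng.shuffle(shuffled)
        assignment = dict(zip(subjects, shuffled))
        if avoid is None or all(assignment[s] != avoid[s] for s in subjects):
            return assignment


subjects = list(range(1, 25))
day_1 = draw_assignment(subjects, orders, rng)
day_2 = draw_assignment(subjects, orders, rng, avoid=day_1)

print(len(orders), day_1[1], day_2[1])
```

Redrawing until no subject repeats the first day's order is one simple way to meet the stated requirement that the second day's order differ from the first.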

On each of the remaining experimental days 
all of the groups tracked for three two-minute 
trials per day under one of the levels of target 
brightness. At the end of the experiment, all 
four groups had tracked under all four levels 
of brightness. The orders in which the sub- 
jects received the different brightness levels 
were also assigned at random so that each sub- 
ject received a different one of the 24 possible 
orders. Six minutes of dark adaptation were 
required of all subjects before the first trial 
was begun on each of the last four days. Fol-
lowing the completion of this phase of the ex-
periment, kymograph records were obtained
of the adjustments made by some of the sub-
jects in tracking the narrow and wide targets.

The accuracy scores obtained indicated that
the wide target was superior to the narrow
target when all other conditions were con-
sidered together. Generally, the vernier
cursor was superior to the overlapping cursor.
Finally, tracking accuracy was observed to be
directly related to the level of target bright-
ness. An analysis of variance performed on
these data showed differences between the
levels of these variables to be significant below
the .1% point.⁴

Additional information concerning the re-
lationship between target width and tracking
accuracy is pictured in Figure 1. This figure
shows that the accuracy of tracking the wide
target was greater than the accuracy obtained
with the narrow target only when the cursor
was in the overlapping position. When the
cursor was in the vernier position, little differ-
ence appeared between the targets. This
interaction was significant below the 1%
point.


Fig. 1. Mean accuracy scores made with different
cursor types and with different target widths.
[Figure: mean accuracy score plotted against cursor
type (overlapping, vernier) for the narrow and the
wide target.]


Further evidence of the complicated nature
of the relationships existing between the
variables studied in this experiment is shown
in Figure 2. The interaction represented by
this figure was also significant below the 1%
point. The figure portrays the same relationship
depicted in Figure 1, but in addition
indicates that even these relationships vary
with the level of target brightness. At the
higher brightness levels the wide-target
overlapping-indicator combination seems to be
superior to all others. However, “t” tests
show that this target-indicator combination is
significantly superior to the narrow-target
vernier-indicator combination only at the
brightness level of log 0.37 millilamberts. At
the lower brightness levels, the vernier indicator
was superior regardless of target width.
As the figure also indicates, the lowest accuracy
scores were made throughout the entire range
of target brightness when a narrow target was
combined with an overlapping indicator.
Similar results appear in an examination of
the kymograph records obtained at the end of
this experiment.


Fig. 2. The variation in tracking accuracy as a
function of level of illumination of the target-cursor
pattern for an overlapping pattern (open figures) and
for a vernier pattern (solid figures) and with different
widths of target.
[Figure: accuracy score (seconds) plotted against the
logarithm of target brightness in millilamberts, from
-2 to 2, for the wide-overlapping, narrow-overlapping,
wide-vernier, and narrow-vernier combinations.]


Experiment 2. The Effects of a Wide Range
of Target Widths, and Variation in Cursor
Color. In this experiment six target widths
were used. The targets were white and sub-
tended visual angles of 4, 30, 56, 82, 108, and
134 minutes. Their height subtended a visual
angle of 58 minutes. The cursor, which was
kept in the overlapping position, had an
angular width of 4 minutes and a length of 60
minutes. In this experiment, it was painted
a flat red. A high level of target brightness
was maintained throughout the experiment.


4 The critical summaries of data as well as the analysis
of variance tables have been deposited with the
American Documentation Institute.


Subjects were assigned to the rows of three 
randomly drawn 6 X 6 Latin squares, in which 
the columns represented trials and the treat- 
ments were the six target widths. The 18 
subjects were students who appeared on two 
separate days with a one-day interval between 
appearances. On the first day, all subjects 
tracked each of the targets for two minutes. 
On the second day, this pattern was repeated. 
At the end of the second day’s trials all sub- 
jects ran one trial on the 4-minute target and 
one trial on the 30-minute target, this time 
using a white cursor. The order in which 
these targets were tracked was counter- 
balanced between subjects. 
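A Latin square arrangement of this kind can be sketched as follows. The construction used here (a cyclic square with its rows, columns, and treatment labels shuffled) is an assumption; the source does not say how the three squares were "randomly drawn", and this method does not sample every possible Latin square with equal probability. The seed and function name are arbitrary.

```python
import random


def random_latin_square(treatments, rng):
    """Build an n x n Latin square by shuffling a cyclic square."""
    n = len(treatments)
    # Cyclic base square: entry (r, c) is (r + c) mod n.
    square = [[(r + c) % n for c in range(n)] for r in range(n)]
    rng.shuffle(square)                      # permute the rows
    columns = list(range(n))
    rng.shuffle(columns)                     # permute the columns
    square = [[row[c] for c in columns] for row in square]
    labels = list(treatments)
    rng.shuffle(labels)                      # permute the treatment labels
    return [[labels[v] for v in row] for row in square]


target_widths = [4, 30, 56, 82, 108, 134]   # minutes of visual angle
rng = random.Random(1952)                   # arbitrary seed

# Three 6 x 6 squares give 18 rows, one row of trial orders per subject.
squares = [random_latin_square(target_widths, rng) for _ in range(3)]
schedule = [row for square in squares for row in square]

for subject, row in enumerate(schedule, start=1):
    print(subject, row)
```

Within each square every target width occurs once in each trial position and once for each subject, which is what allows trial order to be separated from the effect of width.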

Analysis of the difference in scores made 
with the 4- and 30-minute targets combined 
with the white cursor again showed that ac- 
curacy was greater with the wide target than 
with the narrow target. A “t” test indicated
that this difference was significant below the 
5% level. Relative to the mean score made 
with the narrow target, this difference repre- 
sents a 2.8 per cent increase in accuracy. 




The scores made with the red cursor showed 
no increase in accuracy for the 30-minute 
target as compared to the 4-minute target. 
Actually, the mean scores made in this situa- 
tion were identical to the nearest one-tenth 
second for these two target widths. Of equal 
interest is the fact that over a range of target 
widths in which the widest target was over 30 
times as wide as the narrowest, the accuracy 
of tracking decreased only 3.5 per cent. This, 
too, is a significant difference, but it is one of a
surprisingly small magnitude considering the
extent of the variation in target width. 

Experiment 3. Retest of the Effects of Target 
Width and Variation of Cursor Color. In the 
second experiment, the color of the cursor was 
observed to have some effect on accuracy when 
only two target widths were used with both 
cursor colors. The third experiment was 
conducted to test the color effect over a 
greater range of target widths. 

Five targets having angular widths of 6, 18, 
30, 42, and 54 minutes were used. These 
targets were painted a flat white and were 58 
minutes high. Two overlapping cursors were 
used with these targets. Both cursors sub- 
tended visual angles of 6 minutes in width and 
were 66 minutes long. One was painted a flat 
white, while the second was painted a flat red. 
In the experiment, each cursor was combined 
with each of the five targets, making a total of 
10 experimental conditions. Target bright- 
ness was maintained at a high level through- 
out the experiment. 

As subjects, twenty students were assigned 
to the rows of two randomly drawn 10 X 10 
Latin squares. All subjects appeared for five
consecutive days, and on each day tracked for
one minute with each of the 10 target-cursor
combinations in an order indicated by the 
Latin square design. Scores from only the 
last three days were used in the determination 
of the results, because it was desired to test the 
critical effects on well-trained subjects. 

The results with the red cursor confirmed 
Experiment 2 in showing that no optimum 
target width appeared. For this more re- 
stricted range of target widths, in which the 
ratio of the widths of the narrowest and widest 
target was 1/9, an accuracy decrease of only 
0.9 per cent was observed. 




With the white cursor there appeared a 
relative decrease in accuracy scores of 0.4 per 
cent when the widest target was compared to the 
narrowest. In addition, slightly higher scores 
were made with targets of angular widths of
18, 30 and 42 minutes than with the narrowest 
target, although these differences did not meet 
the 5% criterion of significance. 

When all targets were considered together, 
there appeared to be little difference between 
the mean scores made with the red and white 
cursors. 


Summary 


The present study, comprising three experi- 
ments, has disclosed certain significant facts 
about the psychophysical determination of 
error in direct tracking of a target. 

It has been found that the pattern relations 
of target and cursor are related to accuracy of 
visual tracking. A vernier relation of target 
and cursor generally produces more accurate 
performance than an overlapping pattern of 
target and cursor when the color of the two
elements of the visual presentation is the 
same. 

Illumination level of target and cursor
affects accuracy in tracking. At low
illuminations, accuracy drops off decisively for a
cursor pattern which overlaps the target.
Generally, the effects of change in illumination
in relation to variations in the pattern of
target and cursor, as well as alterations in the
width of the target, are marked by complex
interactions which make it difficult to formulate
simple rules regarding these relations.

By far the most significant result of this
study is the observation that an increase in
the width of the target produces no marked
change in the level of tracking accuracy.
Targets some thirty times the width of the
controlled cursor are tracked with almost the
same accuracy as targets equal to the width of
the cursor. In fact, under some conditions
maximum levels of accuracy are found with
targets of a width greater than the width of
the cursor. These results are secured under
conditions in which the criterion of error re-
mains fixed for all target widths.

The findings just noted have direct applica-
tion for the design of the perceptual presenta-
tion in tracking devices as well as theoretical
implications for understanding tracking be-
havior.

The target-width effect, as observed here, 
suggests that the psychophysical laws of track- 
ing behavior deal mainly with phenomena of 
organization of visual pattern rather than 
tolerances of alignment and misalignment of 
visual contours. The tracker’s accuracy is 
determined by precision in scaling or bisection 
of the visual pattern rather than in terms of 
the optical resolution of limiting contours that 
define the visual presentation. 

It may be that the target-width effect is a 
direct product of the character of the tracking 
response. The continued oscillation of the 
tracker about the target restricts visual judg- 
ment to a sustained estimate of relative sym- 
metry of target and cursor position. In this
sense, the present study constitutes a demon-
stration that the psychophysical aspects of 
tracking behavior reflect the special charac- 
teristics of variable error control that identifies 
the nature of the tracker’s response. 

This interpretation of the psychophysical 
organization of tracking implies that any cue 
aspect of the tracking situation which does not 
alter the critical oscillatory features of re- 
sponse will not change materially the level of 
accuracy in the situation. Such a systematic 
notion of tracking seems to give a reasonable 
account of some of the other special aspects of 
tracking, especially the phenomena of learning, 
the effects of change in speed of target, and the 
effects of transfer of training in relation to
target speed.

Received January 8, 1952.


Comment on Darley’s “Special Review” 


Kenneth Eells 


U. S. Naval Personnel Research Unit, San Diego, California* 


In the April, 1952, issue of J. appl. Psychol.
appeared a “Special Review” of a book,
Intelligence and Cultural Differences, of which
I was a major contributing author. It is not
clear exactly why this particular book was
singled out for this flattering “special” rather
than routine reviewing. A search of the back
files of the Journal reveals only two other such
instances in the past five years.

Since the book deals with a large-scale re- 
search study in an important (and contro- 
versial) area, three possibilities immediately 
come to mind: (1) The book possesses unusual 
merit and therefore is worthy of a special effort 
to bring it to the attention of the Journal 
readers; (2) the book is so extremely bad that 
it should be pointed out as a horrible example; 
or (3) the book deals with a controversial area 
and therefore will be of special interest to the 
Journal readers. 

The first of these possibilities can be dis- 
missed quickly, in view of the strongly deroga- 
tory nature of the review. While the writer is 
not in a position to pass an objective judgment 
with respect to the second possibility, it does 
seem odd, if this is what prompted the special 
attention, that the Program Committee of 
Division 5 of the APA selected this particular 
book, and the research which it reports, as the 
subject for a symposium at the recent Wash- 
ington meeting. If the third reason is the 
correct one, the Editor of the Journal is pre- 
sumably at fault in having a controversial 
book reviewed by an avowed opponent of the 
point of view expressed in the book without 
making provision for presentation of other 
points of view. 

It might be, of course, that the pages of an
APA journal were simply used to present a
one-sided review of a controversial book and
that the review was played up by the Editor
as a “Special Review” to add prestige to the
attack contained in the review. But one
would come reluctantly to any such conclusion.
APA journals have had, for the most part, a
long and honorable history of fair play and
scientific objectivity of which they can be
proud.

* The opinions expressed are solely those of the author
and are in no way official; nor are they to be construed
as representing those of the U. S. Naval Personnel
Research Unit or Bureau of Personnel.

Darley charges the authors of the book with 
lack of dispassionateness and objectivity, 
special pleading, unwarranted assumptions, 
confusion of hypotheses with conclusions, un- 
satisfactory sampling procedures, faulty in- 
terpretation of correlations, improper selection 
of items for study, a research design so planned 
as to maximize the chances of proving the 
hypotheses, inconclusiveness of findings, and 
failure to perform certain possible analyses of 
the data. Not a single significant favorable 
comment is found in the entire review, although 
the book appeared under the joint authorship 
of five individuals connected with reputable 
institutions and was published by a reputable 
university press. 

Probably the most serious charges which 
Darley makes, because they attack the 
scientific integrity of the authors, are those 
dealing with confusion of hypotheses with 
conclusions, unwarranted assumptions, and a
research design so planned as to maximize the 
chances of proving the hypotheses. One 
would expect anyone making such charges to 
be scrupulously careful to be accurate and to 
document such statements in detail. Darley 
fails in both respects. 

Space forbids an adequate discussion of 
these points but a few general observations can, 
and should, be made. The writer wishes first, 
however, to apologize for the too-general 
nature of the comments which follow, and for 
his failure to document his statements more 
fully or to develop them to the point where 
the reader could make an intelligent judgment 
about the matters himself. He prepared such 
a statement, taking up each of Darley’s
charges and citing quotations from the book
in each case to show that Darley had either
read carelessly or had misinterpreted what he
had read, had misunderstood the basic nature
of the research, was being impossibly per- 
fectionist in his expectations, and in one case 
had “slanted” his review by selective citation. 
All this was spelled out with quotations and 
page references to document it fully. The 
Editor ruled that this statement was too long. 

An abbreviated form was then submitted, 
which would have required four journal pages. 
This too was refused publication, even when 
the writer offered to pay the regular page rate 
for adding new pages to the Journal, as is done 
with “early publication” materials. He was
told, “Either you will provide a reply within
1200 words (1½ pages) or you will have no
space at all.” The Editor’s letter openly
took sides with the reviewer.

Scientific understanding can sometimes be 
advanced through scholarly controversy, but 
only when such controversy is based upon 
either carefully documented evidence or
carefully reasoned analysis. Neither is pos- 
sible within the space limitations which ap- 
parently apply to such controversies in J. 
appl. Psychol. This being the case it is 
particularly important that reviewers give 
well-balanced and objective presentations of 
such controversies. 

Darley’s charge that two of the authors were 
lacking in objectivity and “transmute assump- 
tions and hypotheses into foregone con- 
clusions” reveals that he simply did not read, 
or did not understand, Chapter 1 of the book, 
in which the structure of the book was ex- 
plicitly described. The chapters to which 
Darley refers were there clearly described as 
introductory analysis of the issues for the 
purpose of providing insight into the setting 
of the problem. Any interpretation of these 
chapters as conclusions from the research 
study was explicitly warned against. The
wisdom of such a presentation may fairly be
questioned, but to charge that the chapters do 
something which they do not do is something 
else again. 

Space will permit comment on only one of 
the assumptions which Darley refers to: 
“Davis and Havighurst apparently assume 
. . . that all methodological questions in the 
study of stratification have been solved.” 
This is utter nonsense. No one who has 
worked in this area could possibly think that 
all methodological questions have been solved. 
At no point in the book do any of the authors 
say, or imply, any such thing. In no other 
area of research is it customary to imply that 
no research should be conducted until all 
methodological problems have been com- 
pletely solved. Would Darley like to see all 
research in the field of guidance deferred until 
perfect procedures are developed for measuring 
adjustment of individuals? To such extremes 
does a reviewer go when he substitutes zeal for 
destruction for objectivity. 

To charge the authors of a research study 
with having so designed the study as to maxi- 
mize the possibilities of proving their hypothe- 
ses is about as far as one can go in charging lack 
of personal and scientific integrity. Darley’s 
description of what he based this upon reveals 
a complete misunderstanding of the purpose 
of the research study he was reviewing. Un- 
fortunately, to discuss this important point in 
sufficient detail to make it clear to one who 
probably has neither the book nor the review 
before him would require more space than is 
available. 

The writer will be glad to supply a mimeo- 
graphed copy of his more extended statement 
to any interested reader. 


Received August 27, 1952. 
Published out of turn by the editor. 


Reply to Eells’ Comment on Darley’s “Special Review” 


John G. Darley 


Laboratory for Research in Social Relations, University of Minnesota 


The editor of this journal has permitted me 
to make a short rejoinder to the above com- 
ment by Dr. Eells on my earlier review of the 
book of which he was senior author. I doubt 
that this exchange will produce any rap- 
prochement, in view of the complexities and in- 
volvements inherent in research investigations 
of stratification phenomena, but I shall try at 
least to make my position as a reviewer clear. 

A substantial part of Dr. Eells’ comment 
concerns the editorial policies and practices of 
this Journal. I am obviously not in a position
to reply to this, nor is it entirely germane to
the issue as I see it. I am sure that Dr. Eells’
position on my review will be abundantly clear 
to anyone who accepts his offer to receive “a 
mimeographed copy of his more extended 
statement.” I sincerely hope many psycholo- 
gists take advantage of that offer. 

The burden of his present comment regard- 
ing my review appears to be twofold: (1) I 
made no significant favorable comment in my 
review; and (2) I have criticized him and 
particularly his co-authors not only for the 
design and interpretation of his study but also 
for their interpretations and treatment of the 
general problem toward which his research was 
directed as a specific study. 

I know of no commonly accepted canons or 
mores of book reviewing for technical journals 
under which a reviewer is required to make 
favorable comments. I have, rather, operated
on the philosophy that if a reviewer believes a
book to be either outstandingly bad or good, he 
is morally obligated to state his carefully con- 
sidered opinions unequivocally. This obliga- 
tion is the greater, to the extent that a book 
deals with a serious or a controversial topic and 
to the extent that great claims are made for 
the book. Psychology should by now be 
somewhat beyond the point in its history as a 
science when it must award many A grades for 
effort only. 

In the last analysis, neither the author nor 
the reviewer represents the final judgment in 
history. Both the author and the reviewer 
express their own best judgments, morally and 
scientifically; present and future readers are 
still free to accept the judgment of one or the 
other, or neither, as they see fit.

So far as the second phase of Dr. Eells’ com- 
ment is concerned, he has well summarized my 
objections to his work, and I find no reason to 
change them on the basis of his rebuttal. 
While I have no wish to choose up sides in this 
matter, it was of interest to note that 
McNemar’s recent and independent review of
the same book¹ raised some of the same
criticisms that seemed to me worth noting,
and in addition raised a few other objections
that, regrettably, escaped me.


¹ McNemar, Q. Review of Intelligence and Cultural
Differences. Psychol. Bull., 1952, 49, 370-71.




Book Reviews 


Dale, Ernest. Planning and developing the
company organization structure. New York:
American Management Association. Re-
search Report Number 20. 1952. Pp. 232.
$4.50. ($3.00 to AMA members.)


This book is the result of a two-year re- 
search program sponsored by the American 
Management Association and designed to 
emphasize the realities of organization. The 
research included visits to 40 companies 
of varying sizes; participation in the work of 
changing company organizations; examination 
and analysis of several hundred organization 
manuals and organization charts; a survey of 
American and foreign literature on organization; 
and interviews with many military and indus- 
trial thinkers and practitioners on organization. 

There are two main parts to the book. Part 
One is entitled “The Dynamics of Organiza- 
tion—A Study of Organization Problems at 
Various Stages of Company Growth” and is 
divided into seven sections referred to as the 
seven stages of company growth—(1) deter- 
mining objectives and dividing work; (2) 
delegating responsibility; (3) span of control; 
(4) the staff assistant; (5) the staff specialist; 
(6) group decision-making; and (7) decentrali- 
zation. Part Two is entitled “The Mechanics 
of Organization” and is divided into 11 sections 
under such topics as reorganization; timing, 
defining and constructing the organization; the 
organization chart; the organization manual; 
and gaining acceptance. There are six ap- 
pendices which include job descriptions of 
major executives; descriptions of management 
committees; content and extent of decentrali- 
zation of major management functions; the 
summary of contents of an oil company man- 
agement guide; and organization nomenclature 
of a large automobile company. In addition, 
there is an extensive bibliography grouped for 
each section of the book and also separate sec-
tions covering certain selected topics in organi-
zation.

Throughout all of the material the author
combines what is technical, academic or
[illegible] on the subject of organization with
what he has found to be business practice and
practical application. In all of the main sec-
tions and in many of the sub-sections he il-
lustrates the theoretical material with perti- 
nent cases from specific companies. In some 
instances, however, this process of illustrating 
leads him astray since he uses the specific case 
as illustrative of a general principle. Much 
of the material is illustrated with organization 
charts, although the principles and problems 
of designing and constructing organization 
charts are not given the importance which 
they justify in a book on this subject. 

In his discussion of organization problems 
in Part One, the author stresses the desir- 
ability and the growing use of (1) the staff 
assistant, a person who is not an executive, but 
aids an executive in getting things done with- 
out taking away or relieving the executive of 
any of his responsibilities; and (2) the staff 
specialist or consultant. He emphasizes the 
value of the staff specialist in large organiza- 
tions. However, he fails to point out the very 
pertinent fact that individuals who can be 
employed to perform specific assignments as 
the need for them arises are very valuable to 
small businesses where the quantity of work at 
no time justifies the employment of full-time 
persons for specific occasional duties. In a 
highly practical presentation he includes ex- 
cellent discussions of group decision in operat- 
ing organizations along with discussions of 
different types of committees, their functions 
and the strategic factors in successful com-
mittee work.

Perhaps the major weakness of the volume
is that throughout the many pages devoted to
the concepts of responsibility and authority
the author does not define them, nor does he
make any distinction between them. In fact
it appears that he uses the terms interchange-
ably, and discusses the delegation of responsi-
bility as well as the delegation of authority.
These concepts are important in organization
[illegible] discussions [illegible] that
the amount of responsibility and authority ideally
should be equal, that there are important
differences between them, and that one of the
outstanding differences is that authority can
be delegated whereas responsibility cannot.




The professional value of this book to psy- 
chologists will be limited largely to those who 
have direct contacts with industry, or who are 
connected with schools of business, or who have 
a specific interest in executive activity and 
organization problems. While the author 
stresses the dynamics and continuity of or- 
ganization planning and the dependence of 
much organizational activity on the personal- 
ities in the organization, his discussions do not 
include any direct references to the theoretical 
psychological principles involved. On the 
other hand, all business executives should be- 
come acquainted with the book. They should 
not read it, however, with the expectation that 
it will solve their organization problems or 
that it will eliminate the necessity of their 
continuing use of expert advice and counsel as 
the problems discussed in the book arise in 
their own business relationships. 


C. G. Browne 
Wayne University 


Von Foerster, Heinz. (Ed.) Cybernetics: circu- 
lar causal and feedback mechanisms in bio- 
logical and social systems. New York: 
Josiah Macy, Jr. Foundation, 1952. Pp. 
240. $4.00. 

This is the third volume on cybernetics 
published by the Josiah Macy, Jr. Foundation 
and is a complete transcript of the eighth con- 
ference on this subject sponsored by the 
Foundation. Although the title does not indi- 
cate it, this conference was devoted almost 
entirely to problems of communication. Six 
papers were presented and thoroughly dis- 
cussed by representatives of the natural and 
social sciences. Of particular interest to 
applied psychologists seriously concerned with 
communications might be the following: Com- 
munication Patterns in Problem Solving 
Groups (Bavelas), Communication between 
Men: Meaning of Language (Richards) and 
Communication between Sane and Insane: 
Hypnosis (Kubie). Communication between 
Animals (Birch), Presentation of a Maze- 
Solving Machine (Shannon) and In Search of 
Basic Symbols (MacKay) are somewhat far 
afield, irrelevant or overly technical for the 
reader without a strong general interest in 
communications per se. 




The book is not a collection of answers to 
communication problems or even a report of 
experimentation directed toward that end. It 
is instead a stimulating source of hypotheses 
and suggestions and an interesting illustration 
of the application of widely differing ap- 
proaches to a common problem. The dis- 
cussions are intense and wide-ranging and at 
times seem to get out of hand. The free ex- 
change of ideas, however, seems very valuable 
and is certainly worth the occasional digression. 

Two appendices deal with the nomenclature 
of information theory and references. 


James J. Jenkins
University of Minnesota


Janis, Irving L. Air war and emotional stress.
New York: McGraw-Hill Book Company, 
Inc., 1951. Pp. 257. $5.00. 

On November 3, 1944 President Roosevelt
directed the War Department to investigate
systematically the role of air power in war and
peace. The “human factor” was an integral
part of that investigation by the U. S. Strategic
Bombing Survey in Germany and Japan;
ag was not included in the scope of the 

After the war, the Air Force, desiring a more
complete coverage of psychological aspects of
air warfare, contracted with the Rand Cor-
poration to review the literature on the sub-
ject, including British studies and the pro-
tocols of the U. S. Strategic Bombing Survey
reports on Germany and Japan. Prof. Janis’
book is the result. In Part I he dispels some
popular illusions about reactions to A-bombs
among the survivors at Hiroshima and
Nagasaki. There was psychological unpre-
paredness for the explosion and resultant non-
adaptive behavior as an immediate reaction.
Though personal involvement was more wide-
spread, emotional disturbances were not
markedly more severe than acute anxiety and
depression noted among British and German
populations after severe raids of conventional
bombing. Prof. Janis continues, “The amount
of defeatism at Hiroshima and Nagasaki was
less than in other Japanese cities . . .
The attitudes of the A-bombed population
were found to resemble those of people in the
lightly bombed and unbombed cities rather
than in the heavily bombed cities . . .”
(p. 60) and “the dominant psychological
effects resulting from A-bomb disaster gener-
ally did not differ in any unique way from
those produced by other types of bombing
disasters” (p. 71).

As to reactions to conventional bombing, 
“mental breakdown, panic and mass demorali- 
zation rarely materialized” (p. 153). Chronic 
mental disorders notably increased only among 
those having phobias to loud noises, darkness, 
fire, and danger. Psychosomatic disorders 
were not numerous. Individual and group 
adaptive mechanisms absorbed most shocks 
and emotional reactions to bombing. 

On the other hand, widespread emotional 
stress resulted from deprivation caused by 
bombings. Inadequate food and sanitary 
facilities, breakdown of transportation and 
public utilities, lack of retaliation against 
the continuing air raids all caused bitter re- 
sentment against home authorities for failure 
to provide protection. Lowered morale be- 
came apparent as crime and delinquency rates, 
black-marketing, and other symptoms of 
social disorganization increased. Gradually 
apathy and defeatism developed into defection. 

In conclusion, Prof. Janis warns that it can 
happen here, that realistic education to the 
human factor problems posed by the H-bomb, 
radiation poisoning and biological warfare is 
needed; that there must be a greater aware- 
ness of “group identification as a major factor 
underlying efficient performance in the face 
of danger” (p. 214). 

This is more than a psychologically oriented 
air warfare review of literature and data. It 
is a provocative glimpse into the future. 
Psychologists particularly will find it challeng- 
ing and worthy of study. 


Milton D. Graham
Human Resources Committee,
Research and Development Board,
Washington 25, D. C.


Rogers, C. R. Client-centered therapy. Boston:
Houghton Mifflin Co., 1951. Pp. vii + 56[remainder illegible]

This latest volume from the prolific Dr.
Rogers is a systematic and inclusive descrip-
tion of nondirective therapy, as currently con-
ceived. Part I presents a “current view,”
Part II, “the application,” and a briefer Part 
III, “implications for psychological theory.” 
In the “application” unit, Elaine Dorfman has
contributed a chapter on “Play Therapy,”
Nicholas Hobbs, a chapter on “Group-
centered psychotherapy,” and Thomas
Gordon, “group-centered leadership and ad-
ministration.” Gordon’s chapter is distin-
guished by its extended and varied refer-
ences, and by its willingness, at this stage of
knowledge, to accede flexibly to demands of
practical situations.

In Client-Centered Therapy Carl Rogers gives 
further evidence of his profound faith in the 
ability of the individual to solve his own 
problems. His present “theoretical” position 
is that the well-adjusted individual is one 
whose self-structure can assimilate and in- 
clude perceived experiences without threat. 
Many individuals, however, seek therapy be- 
cause of the unsuccessful attempt of the self- 
structure to defend itself by rejection of those 
experiences which are inconsistent with the 
individual’s conception of himself. If the 
counselor can indicate to the client unre- 
served acceptance and accurate understand- 
ing of the client’s attitudes and perceptions, 
the client is permitted to become aware of and 
to integrate these previously “denied ex- 
periences” because they become gradually less 
threatening to the self. 

Much of Rogers’ message is buttressed by 
evidence from numerous researches of present 
and former students and by a wealth of illustra- 
tive excerpts which are taken verbatim from 
actual counseling interviews. Out of his con- 
viction and an intimate, empirical acquaint- 
ance with therapy, he evolves 19 propositions, 
constituting the bulk of his last chapter, “a 
theory of personality and behavior.”

For the personnel worker, the strength of
Client-Centered Therapy probably lies in the co-
herent system which guides the interactions of
counselor and client: the central “hypothesis”
of nondirective therapy. Its fallibility ap-
pears to reside in the fact that many of the
underlying constructs upon which the “hy-
pothesis” is based seem to be intuitively rather
than empirically derived. For example, “the
phenomenal field,” “the self” or “threat”
[text missing] but only inferred. One wonders whether
Rogers’ own basic distrust of empiricism in
data-gathering and theory construction has
impelled him to state (Proposition VII, p. 494)
that “The best vantage point for understand-
ing behavior is from the internal frame of
reference of the individual himself.” Perhaps
it is this distrust that prevents Rogers from
analyzing with meticulous care much of the
research data which he presents in support of
his arguments.

Personnel workers in general, especially 
those engaged in counseling activities, would 
do well to become familiar with the contents 
of this book. Just as Strong and his co- 
workers have contributed materially to our 
knowledge of interests and their measurement 
through a combination of singleness of pur- 
pose with extensive research and writing over 
a period of years, so Rogers and his students 
have added to our knowledge about counseling
in the past decade. Careful reading of Client- 
Centered Therapy cannot fail to stimulate the 
personnel worker in the direction of asking 
himself questions about what he is doing, what 
he knows and does not know. In some ways 
Rogers is not unlike Paracelsus who strove 
mightily 450 years ago to answer questions 
about man in his universe with meager data
and inadequate conceptions available to him.
As in the case of Paracelsus, Rogers’ latest
book will probably be enthusiastically ac-
cepted by some, violently rejected by others,
casually received by few.


Harold B. Pepinsky 


Occupational Opportunities Service, 
The Ohio State University 


Bryan, Alice I., The public librarian (With a 
section on the education of librarians by 
Robert D. Leigh). New York: Columbia 
University Press, 1952. Pp. 474. $6.00. 


This volume completes the series of studies 
of public libraries made by the Public Library 
Inquiry of the Social Science Research Council 
and directed by Robert D. Leigh. It consists 
of Parts 1-3 which contain Dr. Bryan’s survey 
of public library personnel and library person- 
nel organizations and administration; Part 4, 
which is a survey of the education of librarians 
(Dr. Bryan designed the survey and deter-
mined its method, though the execution was
undertaken by Dr. Leigh); and Part 5, which
is the summary and conclusions for both
surveys.

Dr. Bryan has done a workmanlike job in 
presenting and analyzing the information ob- 
tained from questionnaires and tests given to 
3,706 librarians. In addition she has sup- 
plemented her data in some instances with other 
data. The data on personnel organization 
and administration were obtained by a fifty- 
six page questionnaire supplemented by 
personal visits to forty-three of the sixty 
libraries in the sample. 

In analyzing the professional librarian Dr. 
Bryan studied personal characteristics (age, 
sex, marriage, personality revealed by the 
Guilford-Martin Inventory of Factors, recre- 
ation, etc.), Educational status (academic, 
professional, non-credit training, etc.), Eco- 
nomic status (salaries, promotion, personal 
savings, insurance, retirement, etc.) and the 
Library Career (vocational interests as meas- 
ured by the Strong Vocational Interest Blank, 
present attitudes toward librarianship, at- 
tendance at professional meetings, professional 
reading, etc.). All in all this forms without 
question the most comprehensive picture of 
what the librarian is like, what he (or as Dr. 
Bryan aptly points out she) earns, and how 
well librarians are prepared to meet the 
library’s objectives. It is unfortunate that
this study appears in 1952, was written just a
year a [...] reader [...] the aut [...] Dr. B [...]
libraries handle their [...] gement, employee
[...] -service training, em- [...] orale, etc. present again
[...] picture we have of [...] ment. With this conclusion, the careful
reader must agree for Dr. Bryan’s data are 
complete and reliable. One feels at times 
however that she places too much emphasis on 
formal personnel procedures since many of 
the libraries surveyed were small ones (it 
appears that 20 of the libraries averaged less 
than 5 staff members apiece) where formal 
devices are less important than human, com- 
petent and discriminating chief librarians who 
use humanitarian principles in dealing with 
their few staff members. This is in contrast 
to the section on the librarian where in addition 
to Dr. Bryan’s detailed facts one seems to re- 
ceive some feeling of the character and person- 
ality of librarians. 

Dr. Leigh’s section on the education of the 
librarian gives information on the evaluation 
of library schools, their educational programs, 
the students and the faculty and instructional 
resources. The data were obtained by ques- 
tionnaires to the thirty-four accredited library 
schools in the U. S. 

Here again a complete picture is presented
of library education in the U. S. In addition
Dr. Leigh’s study profits greatly from the fact
that previous studies of library schools enabled
the present data to be compared with the
earlier figures and thus trends could be asses-
sed. Dr. Leigh’s study however gives more
complete information than any of the others
cited for earlier comparison.

Dr. Leigh’s findings are in brief that there
are inadequacies in the library schools, which
are primarily the result of great disparities in
resources among the schools themselves. He
suggests that the way of improvement lies in
the consolidation of facilities in the sub-
stantial number of strong library schools.

The two sections present interesting similar-
ities and interesting differences. Both Dr.
Bryan’s section on personnel and Dr. Leigh’s
section on the education of librarians are well
planned and well presented, the evidence col-
lected is pertinent, it is summarized carefully
and completely. In both instances we have
the most complete picture available today. In
both studies there are some interpretations
that one might question but such instances
are [illegible].

The chief differences in the two sections are
in respect to breadth and depth. Dr. Leigh’s
study seems to be broader in that it makes
much greater use of previous studies both of 
library schools in general and of individual 
schools. Dr. Bryan’s study is almost entirely 
limited to the presentation and analysis 
of the facts obtained. Even in analyzing 
the results she makes little use of other 
articles or books on library personnel ad- 
ministration. 

To this reviewer at least, Dr. Leigh’s analyses
and interpretations seem more penetrating.
The discussion of library education seems to
have much more warmth, much more care
and thought, and more evidence of insight.
Both, however, maintain the same high standards
of the six volumes of the Public Library
Inquiry previously published, and the volume
as a whole is a valuable addition to library
literature.

Errett W. McDiarmid 


Dean, College of Science, Literature and the Arts, 
University of Minnesota 


Riley, John W., Ryan, Bryce F., and Lifshitz, 
Marcia. The student looks at his teacher. 
New Brunswick, New Jersey: Rutgers Uni- 
versity Press, 1950. Pp. iv+166. $2.75. 


This is a report of a single, but extensive, 
study made at Brooklyn College by the Soci- 
ology Department at Rutgers. Of the regu- 
larly enrolled Brooklyn College day students, 
6,681 out of a possible 8,000 answered on one 
day an extensive questionnaire and deposited 
the results in a box anonymously. The book 
analyzes the returns. 

The authors make a telling case a priori for 
the study of evaluation of teaching through 
student opinion. They point out that college 
teachers are well protected by tradition and 
rationalization from facing evaluation of any 
objective kind. They do not have the spur 
to improvement that is furnished by knowledge 
of success or failure. Teachers are selected for
their training and research but not for success
as a teacher. Too often promotion also is independent
of teaching performance. In modern
large universities, the diagnostic effect of student
election of courses is lost. Fellow faculty
members do not visit each others’ classes and 
depend largely on student gossip for their 
evaluations of each other.

It is, of course, entirely too much to expect
that a single study, limited to its advance 
program, should supply the answers to the 
questions raised. Answers can come only as 
a result of long experience and of many ap- 
proaches. Where shall we seek evaluations 
of teaching? From students, from colleagues, 
from past students, from administrators? How 
shall judgments be collected? To whom shall 
they be made available? What is their value? 

The authors have assumed that the answer 
to the question of the best source of evaluative 
judgments is “the students.” There are cogent 
arguments for this. Students are the only 
persons who observe teaching. They experi- 
ence it as well as observe it. Teaching is a 
two-way operation and must be participated
in by students or there is nothing taught.

The survey of which the book is a report and
analysis made use of a questionnaire listing ten
items important to good teaching. These were 
selected by the staff after reading the literature 
of evaluation and making their own additions 
and talking over the items with students. 
The ten traits were: (1) organization of subject 
matter; (2) speaking ability; (3) ability to 
explain; (4) encouragement to thinking; (5) 
attitude toward student; (6) knowledge of sub- 
ject; (7) attitude toward subject; (8) fairness 
in examinations; (9) tolerance to disagreement; 
and (10) instructor as “human being.” For 
each trait, four degrees were described. In 
the case of trait 5, for example, the extremes 
of the four degrees were described as: (a) 
sympathetic, helpful, actively concerned; and 
(b) distant, aloof, cold. By this device, the 
authors believe they were avoiding encouraging 
ratings on a comparative basis (Professor X 
as compared with Professor Z); they preferred 
to imply an absolute scale. 

Students were asked about possibly relevant items
concerning the size of the class, their own grade
point, whether the course taught was required
or elective, and so on. Each student was asked
to rate one of his instructors. The one chosen 
depended upon the last initial of the student 
and the serial order of the class in the weekly 
schedule. Findings in individual cases were 
reported in percentiles for each trait, based on 
the distribution (teachers of art, science, and 
social science taken separately). 


The authors were not interested in the evaluation
of the teaching of individuals, although
each teacher was given the returns in his own
case, indicating his percentile standing in each
trait. No determinations of the reliability of
these percentile scores were sought or reported.

The authors’ main attention is given to the
association of “traits” with such items as age,
rank, student grade, student year in college,
size of class, and field of study (art, natural
science, or social science). Importance is attributed
to the higher scores made by teachers
on “knowledge of the subject” rather than on
“encouragement to thinking,” and to the
inference from this fact that student judgments
are in terms of an ideal norm.

Students’ ratings of instructors are little 
affected by size of class, by the required or 
elective character of a course, or by the year 
in college. Value as a teacher appeared to 
diminish with the age of the teacher and with 
advancing rank. Value is favorably associated
with the possession of a Ph.D. and with the
publication of research.

The report will be valuable to anyone inter- 
ested in the evaluation of teaching but it is 
disappointing in that the study was made 
almost without statistical analysis. The style 
of the book is reminiscent of federal agency 
reports, with the frequent use of such terms as
“implement” to avoid being too precise.


E. R. Guthrie
University of Washington 


Fuess, Claude M. The College Board: Its first
fifty years. New York: Columbia University
Press, 1950. Pp. vi+222. $2.75.

The development of educational measurement
in the United States has been an outgrowth
of the insight and skill of many individual
research leaders, the combined effort of faculty
members in a number of universities, the work
of various governmental agencies including
especially the Armed Forces, and the contributions
of several institutions specializing in
tests and examinations. Among Eastern colleges
and preparatory schools, probably no
institution has played a larger part in measurement
theory and practice than the College
Entrance Examination Board. In recent years
this influence has been extended to other
regions as well.

The College Board by Claude M. Fuess provides
an intimate, detailed, and highly readable
account of the first fifty years of the Board’s 
organization, plans, and work. Perhaps no 
one was better qualified to write a book of this 
kind. As headmaster of Phillips Academy and 
as a leader in the independent school field, 
Dr. Fuess has been closely identified with the 
group of men and women who built the Board, 
a fact which he modestly overlooks in his book. 

Dr. Fuess shows how the idea of a college 
board was born out of the need to correct the 
chaotic college entrance situation that prevailed 
during the latter part of the nineteenth century 
when each college prescribed its own entrance 
requirements without regard to what other 
colleges were doing. He then traces the devel- 
opment of the Board from its early organization 
down to the latest phase when it merged the 
operation of its testing activities with those of 
the Carnegie Foundation for the Advancement 
of Teaching and the American Council on 
Education to form the Educational Testing 
Service. 

The book will be of more interest to educators 
than to psychologists. There are nine chapters 
in the book. The chapter on early organiza- 
tion is a particularly interesting and valuable 
contribution, not alone for its record of the 
formation of the Board, but also for the light 
it sheds upon two great and very different 
personalities in higher education who joined 
forces to establish the Board—Charles William 
Eliot and Nicholas Murray Butler. 

Other chapters deal with the Board in action, 
the move toward comprehensive examinations, 
aptitude tests, problems and personalities, and 
the work of the Board during World War II
and following the war. The book concludes 
with a brief chapter discussing the question of 
what lies ahead. In several chapters there is 
considerable detail concerning personalities well 
known in Board circles but perhaps of less 
interest to a wider reading public. But this
kind of detail helps to create in the reader a
feeling for the Board as an on-going organization.

Although there are occasional references to
the work of measurement leaders outside the
Board, such as Thorndike, Bingham, and
Walter Dill Scott, the development of the 
Board’s examinations is treated largely as a 
self-contained project. In point of time, the 
Board’s history closely parallels the history of 
the measurement movement, and inevitably 
Board policy and the nature of its examinations 
were influenced by the activity and ferment 
taking place in educational measurement at 
all school levels throughout the country. How- 
ever, The College Board was written for a special 
purpose, and it could hardly be expected that 
it would deal with the history of the whole 
field of educational measurement. 

The style of Dr. Fuess’s book, unlike that of 
many books in the field of education, makes 
for pleasurable, as well as profitable, reading. 
The book will be of interest and help to all who 
wish to have at hand a lively and accurate 
record of the first fifty years of an influential 
organization that has served institutions well 
and that may be expected to provide continued 
leadership and service of a higher order in the 
years to come. 

Arthur E. Traxler 

Educational Records Bureau, 

New York City 


Hersey, Rexford. Better foremanship—key to
profitable management. New York: Conover- 
Mast Publications, Inc., 1951. Pp. 244. 
$4.00. 

This book is largely a collection of articles 
previously published by the author in Mill and 
Factory, Personnel, and Personnel Journal. It 
is intended for the foreman, “. . . give him the
working tools of his profession, a basic concep- 
tion of how to handle them, and a practical 
concept of his day-to-day job.” 

The book deals with the foreman’s duties 
and responsibilities in the role of a leader, a 
practical psychologist, an administrator, a job 
analyst, an instructor, a safety engineer, and an 
executive. The last two chapters are directed 
to executives above the foreman level and deal 
with higher management’s function in develop- 
ing and maintaining better supervisors. There 
is a 56-page appendix consisting of a checklist 
for rating or self-analysis of foremen, and two 
questionnaires dealing with knowledge and
principles involved in supervision, complete
with scoring keys.

The author does a complete job of covering
in one volume most of the many and varied 
functions which make up the complex job of the 
present-day foreman. This complete cover- 
age, however, results in one of the main weak- 
nesses of the book. In order to cover so much 
material, the author has had to limit his discus- 
sion of a number of the subjects covered to a 
bare listing of rules-of-thumb for the foreman 
to follow. Many of these rules are stated flatly, 
with little or no explanation of the “why’s” 
behind them. This is especially true in the 
case of the treatment of the “human relations” 
phase of the supervisor’s job. This type of 
presentation creates the impression that a 
simple following of these rules-of-thumb is all 
it takes to become a successful and competent 
foreman. In fact, on page 14 we find the
statement, “However, the Leader, who has
followed the techniques laid down previously
in this chapter need not worry about dissatisfaction
in his group.”

The book is written in a clear and concise 
style, with occasional examples and illustra- 
tions. Its use with foremen, however, will be 
limited somewhat because much of the vocabulary
used is at too high a level to be understood
by persons without a fairly extensive educational
background.


Because the bo 
fied material—al] 


nd the many rules laid down in 
the book alone could well serve as the basis 
for a profitable series of training conferences. 


Theodore R. Lindbom 
Personnel Department, 


Midland Co-operative Wholesale, 
Minneapolis, Minnesota


New Books, Monographs, and Pamphlets

Books, monographs, and pamphlets for listing and possible review should be sent to Donald G. Paterson, Editor,
Department of Psychology, University of Minnesota, Minneapolis 14, Minnesota.

Vocabulary, semantics and intelligence. Snowden Arthur.
Bethesda, Md.: Lexicon Press, 1952. Pp.
115. $1.50.

Social psychology. Solomon Asch. New York: Prentice-Hall,
Inc., 1952. Pp. 736.

Readings in experimental industrial psychology. Milton
L. Blum, Editor. New York: Prentice-Hall, Inc.,
1952. Pp. 480. $4.75.

The selection of military manpower—a symposium.
Leonard Carmichael and Leonard C. Mead, Editors.
Publication No. 209. Washington, D. C.: National
Research Council, 1952. Pp. 270. $2.50.

Infant and maternal care in New York City. E. H. L.
Corwin, Director. New York: Columbia University
Press, 1952. Pp. 188. $3.50.

Jim Crow. Jesse Walter Dees, Jr., and James S.
Hadley. Ann Arbor: Ann Arbor Publishers, 1951.
Pp. 529.

Informal groups and the community. Hurley H. Doddy.
New York: Bureau of Publications, Teachers College,
Columbia University, 1952. Pp. 34.

The development of executive talent. M. Joseph Dooher,
Editor. New York: American Management Association,
1952. Pp. 576. $6.75.

Man into wolf. Robert Eisler. New York: Philosophical
Library, 1952. Pp. 286. $6.00.

Personality measurement. Leonard W. Ferguson. New
York: McGraw-Hill Book Co., Inc., 1952. Pp. 457.
$6.00.

The annual survey of psychoanalysis. John Frosch,
Editor. New York: International Universities Press,
Inc., 1952. Pp. 556. $10.00.

Behavior difficulties of children. William Griffiths.
Minneapolis: University of Minnesota Press, 1952.
Pp. 116. $3.00.

In search of self. Arthur T. Jersild. New York:
Bureau of Publications, Teachers College, Columbia
University, 1952. Pp. 141. $2.75.

Child adoption in the modern world. Margaret Kornitzer.
New York: Philosophical Library, 1952.
Pp. 403. $4.50.

Practical sales psychology. Donald A. Laird and Eleanor
C. Laird. New York: McGraw-Hill Book Co., Inc.,
1952. Pp. 291. $4.00.

Principles of human relations. Norman R. F. Maier.
New York: John Wiley and Sons, Inc., 1952. Pp.
474. $6.00.

Conversation and communication. Joost A. M. Meerloo.
New York: International Universities Press, 1952.
Pp. 245. $4.00.

Psychiatry and medicine. Leslie A. Osborn. New 
York: McGraw-Hill Book Co., Inc., 1952. Pp. 500. 
$7.50.

The sensations, their functions, processes, and mecha- 
nisms. Henri Pieron. New Haven: Yale Univer- 
sity Press, 1952. Pp. 469. $6.00. 

Human nature, its development, variations and assessment.
John C. Raven. London: H. K. Lewis and
Co., Ltd., 1952. Pp. 226. 12s. 6d.

History of American psychology. A. A. Roback. New 
York: Library Publishers, Inc., 1952. Pp. 426. 
$6.00. 

The dynamics of the counseling process. Everett L. 
Shostrom and Lawrence M. Brammer. New York: 
McGraw-Hill Book Co., Inc., 1952. Pp. 213. $3.50. 

Recreation leadership. Walter L. Stone and Charles G. 
Stone. New York: The William-Frederick Press,
1952. Pp. 81. $2.00. 

Readings in social psychology. Revised edition. Guy 
E. Swanson, et al. New York: Henry Holt and Co.,
1952. Pp. 680. $5.00. 

The psychology of thinking. W. Edgar Vinacke. New 
York: McGraw-Hill Book Co., Inc., 1952. Pp. 392. 
$5.50. 

America’s manpower crisis. Robert A. Walker, Editor. 
Chicago: Public Administration Service, 1952. Pp. 
191. $3.00. 

How to prepare and use job manuals. Marguerite Hol- 
brook Watson. New York: The William-Frederick 
Press, 1952. Pp. 38. $1.00.

Society and personality disorders. S. Kirson Weinberg. 
New York: Prentice-Hall, Inc., 1952. Pp. 544. 
$5.75. 

Group leadership. Gunnar Westerlund. Stockholm,
Sweden: Nordisk Rotogravyr, 1952. Pp. 257. 16
Swedish Crowns. 

Personnel principles and policies: modern manpower 
management. Dale Yoder. New York: Prentice- 
Hall, Inc., 1952. Pp. 704. 

The British worker. Ferdynand Zweig. Baltimore:
Penguin Books, Inc., 1952. Pp. 240. $.65.

Population and its distribution. J. Walter Thompson
Company. New York: McGraw-Hill Book Co., Inc.,
1952. Pp. 428. $15.00.

Personnel administration through supervisors. A
reading list. Pasadena: Industrial Relations Section,
California Institute of Technology, 1952.


